PLoS Comput Biol. 2011 Sep;7(9):e1002150.
doi: 10.1371/journal.pcbi.1002150. Epub 2011 Sep 15.

A detailed history of intron-rich eukaryotic ancestors inferred from a global survey of 100 complete genomes

Miklos Csuros et al. PLoS Comput Biol. 2011 Sep.

Abstract

Protein-coding genes in eukaryotes are interrupted by introns, but intron densities differ widely between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6-7 introns per kilobase of coding sequence, whereas most other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three (of five) eukaryotic supergroups for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. The MCMC and ML inferences agree closely, whereas Dollo parsimony introduces a noticeable bias in the estimates, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches, including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Reconstruction of intron gains and losses in the evolution of eukaryotes and intron density in ancestral eukaryote forms.
Branch widths are proportional to intron density, which is shown next to terminal taxa and some deep ancestors in units of intron count per 1 kbp of coding sequence. Human (Hsap) is marked by a blue dot. Edges are colored by the relative amount of intron gain and loss, as indicated in the inset scatter plot, where each point corresponds to an edge in the tree. Gain% is the percentage of introns gained in the given lineage from the parent node; loss% is the percentage of the parent's introns lost within the same lineage.
Species names and abbreviations: Aureococcus anophagefferens (Aano), Aedes aegypti (Aaeg), Agaricus bisporus (Abis), Anopheles gambiae (Agam), Allomyces macrogynus ATCC 38327 (Amac), Apis mellifera (Amel), Aspergillus nidulans FGSC A4 (Anid), Acyrthosiphon pisum (Apis), Arabidopsis thaliana (Atha), Babesia bovis (Bbov), Batrachochytrium dendrobatidis (Bden), Branchiostoma floridae (Bflo), Botryotinia fuckeliana B05.10 (Bfuc), Brugia malayi (Bmal), Bombyx mori (Bmor), Coccomyxa sp. C-169 (C169), Chlorella sp. NC64a (C64a), Caenorhabditis briggsae (Cbri), Caenorhabditis elegans (Cele), Coprinopsis cinerea okayama7#130 (Ccin), Cochliobolus heterostrophus C5 (Chet), Coccidioides immitis RS (Cimm), Ciona intestinalis (Cint), Cryptococcus neoformans var. neoformans (Cneo), Chlamydomonas reinhardtii (Crei), Capitella teleta (Ctel), Capsaspora owczarzaki ATCC 30864 (Cowc), Dictyostelium discoideum (Ddis), Dictyostelium purpureum (Dpur), Drosophila melanogaster (Dmel), Drosophila mojavensis (Dmoj), Daphnia pulex (Dpul), Danio rerio (Drer), Entamoeba dispar (Edis), Entamoeba histolytica (Ehis), Emiliania huxleyi (Ehux), Fragilariopsis cylindrus (Fcyl), Phanerochaete chrysosporium (Fchr), Phaeodactylum tricornutum (Ftri), Gallus gallus (Ggal), Gibberella zeae PH-1 (Gzea), Hydra magnipapillata (Hmag), Helobdella robusta (Hrob), Homo sapiens (Hsap), Ixodes scapularis (Isca), Laccaria bicolor (Lbic), Lottia gigantea (Lgig), Micromonas sp. RCC299 (M299), Monosiga brevicollis (Mbre), Mucor circinelloides (Mcir), Mycosphaerella fijiensis (Mfij), Mycosphaerella graminicola (Mgra), Magnaporthe grisea 70-15 (Mgri), Melampsora laricis-populina (Mlar), Micromonas pusilla CCMP1545 (Mpus), Neurospora crassa OR74A (Ncra), Nematostella vectensis (Nvec), Nasonia vitripennis (Nvit), Ostreococcus sp. RCC809 (O809), Ostreococcus lucimarinus (Oluc), Oryza sativa japonica (Osat), Ostreococcus tauri (Otau), Phytophthora capsici (Pcap), Plasmodium falciparum (Pfal), Puccinia graminis (Pgra), Pediculus humanus (Phum), Phaeosphaeria nodorum SN15 (Pnod), Physcomitrella patens subsp. patens (Ppat), Phytophthora ramorum (Pram), Pyrenophora tritici-repentis Pt-1C-BFP (Prep), Proterospongia sp. ATCC 50818 (Prsp), Phytophthora sojae (Psoj), Paramecium tetraurelia (Ptet), Plasmodium vivax (Pviv), Plasmodium yoelii yoelii (Pyoe), Rhizopus oryzae (Rory), Sorghum bicolor (Sbic), Saccharomyces cerevisiae (Scer), Schizosaccharomyces japonicus yFS175 (Sjap), Schistosoma mansoni (Sman), Selaginella moellendorffii (Smoe), Schizosaccharomyces pombe (Spom), Spizellomyces punctatus DAOM BR1173 (Spun), Strongylocentrotus purpuratus (Spur), Sporobolomyces roseus (Sros), Sclerotinia sclerotiorum 1980 UF-70 (Sscl), Trichoplax adhaerens (Tadh), Theileria annulata (Tann), Tribolium castaneum (Tcas), Toxoplasma gondii (Tgon), Taeniopygia guttata (Tgut), Theileria parva (Tpar), Thalassiosira pseudonana (Tpse), Tetrahymena thermophila (Tthe), Ustilago maydis 521 (Umay), Uncinocarpus reesii 1704 (Uree), Volvox carteri (Vcar), Vitis vinifera (Vvin).
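To make the Gain% and loss% quantities in the Figure 1 caption concrete, here is a minimal Python sketch for a single tree edge. The function name and the example counts are hypothetical. The caption does not state the denominator for Gain%, so the sketch assumes it is the intron count at the child node; loss% uses the parent's count, as the caption says.

```python
def edge_gain_loss_percent(parent_introns, gained, lost):
    """Return (gain_pct, loss_pct) for one edge of the tree.

    Assumption (not stated in the caption): gain% is expressed relative
    to the intron count at the child node. Loss% is relative to the
    parent's intron count, as described in the caption.
    """
    if parent_introns < 0 or gained < 0 or lost < 0:
        raise ValueError("counts must be non-negative")
    if lost > parent_introns:
        raise ValueError("cannot lose more introns than the parent has")

    # Introns at the child: those inherited and kept, plus new gains.
    child_introns = parent_introns - lost + gained

    gain_pct = 100.0 * gained / child_introns if child_introns > 0 else 0.0
    loss_pct = 100.0 * lost / parent_introns if parent_introns > 0 else 0.0
    return gain_pct, loss_pct


# Hypothetical edge: the parent has 200 introns, 100 are lost, 50 are gained.
gain_pct, loss_pct = edge_gain_loss_percent(200, gained=50, lost=100)
print(f"gain% = {gain_pct:.1f}, loss% = {loss_pct:.1f}")
```

An edge dominated by loss, as most edges are according to the abstract, would show a high loss% and a low Gain% under this definition.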
Figure 2. Inferred ancestral intron densities and confidence intervals.
The plots for 9 key ancestral forms show the posterior distributions of the ancestral intron density inferred from the sampling chains. On each plot, the horizontal red line shows the median (the dot) and the 95% (+/−47.5%) confidence interval around it, estimated from 50,000 sampled MCMC steps.
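The median and the central 95% interval described in the Figure 2 caption can be read off a set of posterior samples as sketched below. This is an illustration only, not the authors' code: the function names are hypothetical, the samples are synthetic draws from a normal distribution rather than output of the actual MCMC chains, and quantiles are computed by simple linear interpolation.

```python
import random


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted sequence."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def median_and_interval(samples, mass=0.95):
    """Return (median, lower, upper) of the central interval holding `mass`.

    With mass=0.95 the interval spans the 2.5% and 97.5% quantiles,
    i.e. 47.5% of the samples on each side of the median.
    """
    ordered = sorted(samples)
    tail = (1.0 - mass) / 2.0
    return (
        quantile(ordered, 0.5),
        quantile(ordered, tail),
        quantile(ordered, 1.0 - tail),
    )


# Synthetic stand-in for 50,000 sampled ancestral intron densities
# (introns per kbp); the mean and spread are made-up values.
rng = random.Random(0)
samples = [rng.gauss(4.0, 0.5) for _ in range(50_000)]

median, lower, upper = median_and_interval(samples)
print(f"median = {median:.2f}, 95% interval = [{lower:.2f}, {upper:.2f}]")
```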
Figure 3. Inferred intron site histories in prohibitin orthologs (KOG3083).
The tree from Figure 1 is used as the template for the reconstruction. Vertical bars mark intron sites and are positioned along the X axis in proportion to their location in the underlying alignment. The height of green bars is proportional to the probability of intron presence; the height of red bars is proportional to the probability of intron gain in the lineage leading to the node.
