Comparative Study
Genome Res. 2004 Sep;14(9):1686-95.
doi: 10.1101/gr.2615304.

Comparative analysis of apicomplexa and genomic diversity in eukaryotes

Thomas J Templeton et al. Genome Res. 2004 Sep.

Abstract

The apicomplexans Plasmodium and Cryptosporidium have developed distinctive adaptations via lineage-specific gene loss and gene innovation in the process of diverging from a common parasitic ancestor. The two lineages have acquired distinct but overlapping sets of surface protein adhesion domains typical of animal proteins, but in no case do they share multidomain architectures identical to animals. Cryptosporidium, but not Plasmodium, possesses an animal-type O-linked glycosylation pathway, along with >30 predicted surface proteins having mucin-like segments. The two parasites have notable qualitative differences in conserved protein architectures associated with chromatin dynamics and transcription. Cryptosporidium shows considerable reduction in the number of introns and a concomitant loss of spliceosomal machinery components. We also describe additional molecular characteristics distinguishing Apicomplexa from other eukaryotes for which complete genome sequences are available.

Figures

Figure 1
(A) Higher-order relationships between eukaryotes (having complete genome sequence information) rooted with archaeal orthologs, as inferred from a concatenated alignment of 30 highly conserved proteins. The circles indicate bootstrap supports >85% (or Bayesian posterior probability > 0.9) obtained by the full ML (Proml), Puzzle ML, weighted neighbor-joining, parsimony, and minimum evolution methods. Bacterial and archaeal branches are in gray and eukaryotic branches are in black. (B) Plant affinities of the apicomplexan proteins glucose-6-phosphate isomerase and phosphoenolpyruvate carboxylase. (C) Bacterial affinities of the apicomplexan proteins SBMA and thymidine kinase. (D) Animal affinities of the apicomplexan SCP and MAM-domain-containing proteins. In these cases, the circles indicate bootstrap support >85% by ML distance analysis (with Puzzle), RellBP, and neighbor-joining methods. Proteins are represented by their gene names and specific names. Some are abbreviated for convenience. Species abbreviations are: (Afu) Archaeoglobus fulgidus; (Ape) Aeropyrum pernix; (Ath) Arabidopsis thaliana; (Bb) Borrelia burgdorferi; (Ce) Caenorhabditis elegans; (Cpa) Cryptosporidium parvum; (Dm) Drosophila melanogaster; (Dr) Deinococcus radiodurans; (Gila) Giardia lamblia; (Hs) Homo sapiens; (Pa) Pseudomonas aeruginosa; (Pfa) Plasmodium falciparum; (Sce) Saccharomyces cerevisiae; (Spo) Schizosaccharomyces pombe.
Figure 2
(A) Orthology coefficients across different protein functional classes. The overall OC refers to comparison of all proteins within the two apicomplexan proteomes. Surface proteins (or extracellular secreted proteins) are defined as those proteins that contain a predicted signal peptide sequence, lack ER retention signals, and in many instances, contain transmembrane regions, globular cysteine-rich domains, or known surface protein domains. Note that smaller OC values are observed in the Apicomplexa for functional classes such as chromatin dynamics and splicing. Comparison in different eukaryotes of the number of domains per protein (B) and number of types of domains per protein (C). (Cpa) Cryptosporidium parvum; (Pfa) Plasmodium falciparum; (Sc) Saccharomyces cerevisiae; (At) Arabidopsis thaliana; (Hs) Homo sapiens. Comparison of the demography of the most prevalent conserved domains in Plasmodium versus Cryptosporidium (D) and Cryptosporidium/Plasmodium versus yeast (E) by means of scatterplots. The number of proteins containing an occurrence of each of 190 commonly found regulatory protein domains was determined in each proteome using a library of PSI-BLAST profiles of these domains. The numbers were then plotted as a scatterplot, with each organism being compared representing one of the axes. In each graph, the equivalence lines, which have a slope equal to the ratio of the two proteomes being compared, are shown. Points below the equivalence line are overrepresented in the organism on the x-axis, whereas points above the equivalence line are overrepresented in the organism on the y-axis.
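The equivalence-line reading of panels D and E can be illustrated with a minimal sketch. It assumes that "the ratio of the two proteomes" means the y-axis proteome size divided by the x-axis proteome size; the function name, the proteome sizes, and the domain counts below are hypothetical values chosen for illustration and are not taken from the paper.

```python
def classify_domain(count_x, count_y, proteome_size_x, proteome_size_y):
    """Say which organism a domain is overrepresented in.

    Assumption: the equivalence line passes through the origin with
    slope = proteome_size_y / proteome_size_x.
    """
    slope = proteome_size_y / proteome_size_x
    expected_y = slope * count_x  # y-value on the equivalence line
    if count_y > expected_y:
        return "y"      # point lies above the line
    if count_y < expected_y:
        return "x"      # point lies below the line
    return "equal"      # point lies on the line


# Hypothetical proteome sizes (not the real values): slope = 1.25
SIZE_X, SIZE_Y = 4000, 5000

above = classify_domain(10, 20, SIZE_X, SIZE_Y)   # 20 > 12.5
below = classify_domain(20, 10, SIZE_X, SIZE_Y)   # 10 < 25.0
on_line = classify_domain(8, 10, SIZE_X, SIZE_Y)  # 10 == 10.0
```

Scaling by proteome size means a domain is judged by its share of each proteome rather than by its raw count, which is why the line is not simply y = x when the two proteomes differ in size.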
Figure 3
(A) Domain organizations of a representative set of surface proteins from Cryptosporidium parvum (top panel) and orthologs common to Plasmodium falciparum (bottom panel). All proteins shown here have a signal peptide sequence represented by a yellow rectangular box at the beginning of the architectures. The domains are labeled as in Supplemental Table 1. Those not shown in Supplemental Table 1 are (Ank) ankyrin repeat; (CYS) cysteine-rich repeats found in Archaeoglobus proteases; (M) mucophorin domain (with Thr/Ser stretches indicated by gray boxes; see Supplemental Fig. 1); and (TM) membrane-spanning region. (B) Schematic representation of the reconstructed glycosylation pathways in Apicomplexa. The enzymes are shown in boxes along with the protein names of the respective yeast homologs. The reconstructed oligosaccharide chain is shown using abbreviations for the various sugars. (Glc) Glucose; (Gal) galactose; (Man) mannose; (GlcNAC) N-acetylglucosamine; (GalNAC) N-acetylgalactosamine; (Dol) dolichol; and (Ino) inositol. (X?) The uncharacterized sugar added by the WcaK-like glycosyltransferases. Wherever Cryptosporidium contains an enzyme of the pathway, it is indicated with a C in red, and Plasmodium is indicated with a P in black.
Figure 4
Eukaryotic tree showing select points of derivation and loss of various architectures. The proteins are designated by either Plasmodium or Cryptosporidium gene names shown below a cartoon of their architecture. Gray boxes indicate globular domains that are not detected elsewhere. Yellow boxes indicate transmembrane segments or signal peptides. The domains found in chromatin proteins are (Ch) chromo domain; (Br) bromo domain; (PHD) PHD finger; (SET) SET protein methyltransferase domain; (CCC) cysteine cluster associated with SET domains; (AT) AT hook domain; (HTH) helix-turn-helix domain; (MYB) MYB-type HTH domain; (TFIIB) TFIIB-like HTH domain; (OB) oligomer-binding domain; (SWI2/SNF2) ATPase module of chromatin-remodeling proteins; (HAS) domain found in SWI2/SNF2 ATPases. The signaling domains are MYND and MIZ-Zn-finger domains; (R) ring finger domain; (ANK) ankyrin domain; (WD) WD40 β-propeller domain; (Kinase) protein kinase domain; (SAM) sterile α-motif domain; (Sec7) ARF GTPase exchange factor domain; (TBC) GTPase-activating domain; (MORN) a β-hairpin repeat motif; (POZ) pox virus zinc finger domain; (MATH) meprin-A5-TRAF homology domain; (EF) EF-hand domain. RNA-binding domains are (G-patch) glycine-containing RNA-binding domain; (SWAP) suppressor of white apricot domain; (RRM) RNA recognition motif domain. Other domains are (YbeY) predicted metal-dependent lecithinase domain; (ACP) acyl carrier protein domain. cgd8_2430 is a predicted MAP kinase, cgd5_4390 kinase shows a lineage-specific expansion in Plasmodium, and cgd3_2010 is predicted to be a novel signaling receptor with intracellular calcium-binding EF-hand domains.
