Comparative Study
Parasit Vectors. 2016 Nov 29;9(1):611.
doi: 10.1186/s13071-016-1896-4.

Comparative sequence analysis of Cyclospora cayetanensis apicoplast genomes originating from diverse geographical regions

Hediye Nese Cinar et al. Parasit Vectors.

Abstract

Background: Cyclospora cayetanensis is an emerging coccidian parasite that causes cyclosporiasis, an endemic and epidemic diarrheal disease; infection is associated with consumption of contaminated produce or water in developed and developing regions. Food-borne outbreaks of cyclosporiasis have occurred almost every year in the USA since the 1990s. Investigations of these outbreaks are currently hampered by the lack of molecular epidemiological tools for trace-back analysis. The apicoplast of C. cayetanensis, a relict non-photosynthetic plastid with an independent genome, provides an attractive target for discovering sequence polymorphisms useful as genetic markers for detection and trace-back analysis of the parasite. Distinct differences in the apicoplast genomes of C. cayetanensis could be useful in designing advanced molecular methods for rapid detection, subtyping and geographical source attribution, which would aid outbreak investigations and surveillance studies.

Methods: To obtain the genome sequence of the C. cayetanensis apicoplast, we sequenced C. cayetanensis genomic DNA extracted from clinical stool samples, assembled and annotated a 34,146 bp circular sequence, and used this sequence as the reference genome in this study. We compared the genome and the predicted proteome to the data available from other apicomplexan parasites. To begin the search for genetic markers, we mapped the raw sequence reads from an additional 11 distinct clinical stool samples originating from Nepal, New York, Texas, and Indonesia to the apicoplast reference genome.

Results: We identified several high-quality single nucleotide polymorphisms (SNPs) and small insertions/deletions spanning the apicoplast genome, supported by extensive sequencing read data, and a 30 bp sequence repeat at the terminal spacer region in a Nepalese sample. The predicted proteome consists of 29 core apicomplexan peptides found in most apicomplexans. Cluster analysis of these C. cayetanensis apicoplast genomes revealed a familiar pattern of tight grouping with Eimeria and Toxoplasma, separated from distant species such as Plasmodium and Babesia.

Conclusions: SNPs and sequence repeats identified in this study may be useful as genetic markers for identification and differentiation of C. cayetanensis isolates and could facilitate outbreak investigations.

Keywords: Apicoplast genome; Cyclospora cayetanensis; Genomics; Next generation sequencing.
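
The marker search described in the Methods and Results compares each sample against the apicoplast reference. As a minimal sketch, assuming a sample consensus has already been aligned to the reference (equal-length strings, with "-" marking gaps), substitutions and small insertions/deletions could be listed as below. The function name `list_variants` and the toy sequences are hypothetical and for illustration only; the abstract does not name the read-mapping or variant-calling tools the authors used.

```python
def list_variants(reference, sample):
    """List differences between two pre-aligned sequences.

    Both inputs must have equal length; '-' marks an alignment gap.
    Returns tuples of (position, ref_base, sample_base, kind), where
    position is the 1-based coordinate on the ungapped reference.
    For an insertion, position is the last reference base before it.
    """
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have equal length")

    variants = []
    ref_pos = 0  # coordinate on the ungapped reference
    for ref_base, sample_base in zip(reference.upper(), sample.upper()):
        if ref_base != "-":
            ref_pos += 1
        if ref_base == sample_base:
            continue  # identical column (or gap in both): nothing to report
        if ref_base == "-":
            kind = "insertion"
        elif sample_base == "-":
            kind = "deletion"
        else:
            kind = "SNP"
        variants.append((ref_pos, ref_base, sample_base, kind))
    return variants


# Toy aligned sequences (hypothetical, not data from the study).
toy_ref = "ATTCAAAG-TA"
toy_sample = "ATTCGAAGCT-"
variants = list_variants(toy_ref, toy_sample)
for v in variants:
    print(v)
```

In practice, variants would be called from mapped reads with depth and quality filters rather than from a single consensus; this sketch only illustrates the comparison step.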

Figures

Fig. 1
Cyclospora cayetanensis consensus apicoplast genome. The 34,146 nucleotide apicoplast genome was annotated using the MAKER 2 server for de novo gene prediction; the RATT tool [51] was used to transfer annotations from E. tenella. The predicted features were manually curated by comparing the two annotation files. Sequences of the longest fragments of the two terminal inverted repeat regions (bases 1–5147 and 28,999–34,146) are shown in purple boxes
Fig. 2
Complete apicoplast annotation and identification of core apicoplast proteins. a Draft annotations from the C. cayetanensis reference genome (Cc-Ref) were aligned with KP866208 (Cc-HEN01: C. cayetanensis HEN01 strain) and AY217738 (E.tenel: Eimeria tenella). The annotations were then manually curated to define gene boundaries (Table 1) and to correct any anomalies (b and c) in the assemblies. The final C. cayetanensis reference genome annotations were compared with LN999985 (P.falci: Plasmodium falciparum 3D7) and predicted proteins from other apicomplexan apicoplasts available in GenBank. Twenty-nine core proteins present in most of the available apicoplast genomes were identified in the C. cayetanensis reference (Table 1). Core apicoplast proteins 1–29 in C. cayetanensis (track Cc-Ref) are illustrated above from left to right. In Tracks a-c, arrows for rRNA CDS are in red, tRNA in purple and proteins in blue. In Track d, only the predicted proteome is shown in dark blue. b The Cyclospora homolog of RPL36 (ribosomal protein L36) was included (track Cc-Ref) between RPS5 and RPS11 in the reference genome annotation. This partial peptide, homologous to an Eimeria protein (track E.tenel), is not available (indicated by a red line) from the KP866208 genome annotation file in GenBank (track Cc-HEN01). c Based on the core apicoplast proteins identified in a wide variety of apicomplexan parasites, a homolog of the ORF-A protein found in the C. cayetanensis reference genome (track Cc-Ref) was predicted and confirmed to be present in Eimeria (track E.tenel). Currently this CDS is missing (indicated by a red line) from the annotations of E. tenella (AY217738) and other eimeriids. Track Cc-HEN01 represents the apicoplast coding regions of strain C. cayetanensis HEN01
Fig. 3
BLAST analysis of the Cyclospora apicoplast reference genome with other apicomplexans. Apicoplast genomes from an eimeriid, E. tenella, and a sarcocystid, T. gondii (brown and green bars, respectively), and a plasmodiid, P. falciparum 3D7 (violet), are compared with the C. cayetanensis apicoplast (outer circles with purple, blue and maroon bands) using CGview software with its built-in BLAST tool and the default e-value cutoff of 0.1. The thickness of the bands indicates sequence similarity. The eimeriid and sarcocystid apicoplasts have higher nucleotide similarity, while the Plasmodium apicoplast is the least similar of the three, recapitulating the evolutionary distance between these species. The predicted proteomes of these divergent apicomplexans contain mostly conserved core gene content (Fig. 2a)
Fig. 4
Whole genome alignment of C. cayetanensis apicoplast with 12 genomes from apicomplexan genera: Babesia, Eimeria, Leucocytozoon, Neospora, Plasmodium, Toxoplasma, Theileria and Sarcocystis. Whole genome alignment of the C. cayetanensis apicoplast reference genome (Track Cc-Ref) with 12 assemblies from apicomplexans of the order Aconoidasida: (B.bovis) Babesia bovis; (B.micro) B. microti; (B.orien) B. orientalis Wuhan; (L. caull) Leucocytozoon caulleryi; (P.chaub) Plasmodium chabaudi chabaudi; (P.falci) P. falciparum HB3; (T.parva) Theileria parva; and the order Conoidasida: (E.tenel) E. tenella; (Cc-HEN01) C. cayetanensis HEN01; (T.gondi) T. gondii; (S.neuro) S. neurona; and (N.canin) N. caninum. The GC content of the apicoplast genome from each species is displayed at the end of the respective track. In addition, the apicoplast assemblies revealed rearrangements in some species. Nucleotide divergence between members of Aconoidasida and Conoidasida is illustrated here
Fig. 5
A 30 bp tandem repeat inserted into the terminal spacer sequences of NF1, a sample from Nepal. The dashed vertical red line represents the tail-to-head connection in the apicoplast genome sequence. Multiple alignments of raw MiSeq reads are shown in black. Inserted bases are shown in the blue box with flanking end “ATT” and start “CAAA…..ATA” sequences. The typical terminal sequence stretch seen in the reference genome is shown at the bottom of the illustration
Fig. 6
Phylogenetic analysis of 11 C. cayetanensis apicoplast genomes from seven geographical regions compared with the reference genome. The percentage of replicate trees generated using the Neighbor-Joining method in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches. Evolutionary analyses were conducted in the MEGA7 suite [30]
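
Neighbor-Joining trees such as the one in Fig. 6 are built from a matrix of pairwise distances between sequences. The following is a minimal sketch of one simple distance, the proportion of differing sites (p-distance), computed on pre-aligned sequences. The sample names and sequences are toy values, and the helper names `p_distance` and `distance_matrix` are hypothetical; the distance model actually used in MEGA7 is not stated in this record.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(sequences):
    """Return {(name_a, name_b): p-distance} for every unordered pair."""
    return {
        (name_a, name_b): p_distance(sequences[name_a], sequences[name_b])
        for name_a, name_b in combinations(sequences, 2)
    }


# Toy aligned sequences (hypothetical, not data from the study).
toy_alignment = {
    "ref": "ATTCAAAGTA",
    "s1": "ATTCGAAGTA",
    "s2": "ATTCGAAGTT",
}
matrix = distance_matrix(toy_alignment)
for pair, dist in matrix.items():
    print(pair, round(dist, 3))
```

A tree-building program would take such a matrix as input; bootstrap support, as reported in Fig. 6, comes from repeating the analysis on resampled alignment columns.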
