Food Waterborne Parasitol. 2021 Feb 26:23:e00115.
doi: 10.1016/j.fawpar.2021.e00115. eCollection 2021 Jun.

CryptoGenotyper: A new bioinformatics tool for rapid Cryptosporidium identification

Christine A Yanta et al. Food Waterborne Parasitol.

Abstract

Cryptosporidium is a protozoan parasite that is transmitted to both humans and animals through zoonotic or anthroponotic means. Infection causes a gastrointestinal disease known as cryptosporidiosis. To understand the transmission dynamics of Cryptosporidium, the small subunit (SSU or 18S) rRNA and gp60 genes are commonly studied through PCR analysis and conventional Sanger sequencing. However, analyzing sequence chromatograms manually is both time-consuming and prone to human error, especially in the presence of poorly resolved, heterozygous peaks and in the absence of a validated database. For this study, we developed a Cryptosporidium genotyping tool, called CryptoGenotyper, which reads raw Sanger sequencing data for the two common Cryptosporidium gene targets (SSU rRNA and gp60) and classifies the sequence data into standard nomenclature. The CryptoGenotyper performs quality control and classifies sequences using a high-quality, manually curated reference database, saving users' time and removing bias during data analysis. The incorporated heterozygous base calling algorithms for the SSU rRNA gene target resolve double peaks, thereby recovering data previously classified as inconclusive. The CryptoGenotyper successfully genotyped 99.3% (428/431) and 95.1% (154/162) of SSU rRNA chromatograms containing single and mixed sequences, respectively, and correctly subtyped 95.6% (947/991) of gp60 chromatograms without manual intervention. This new, user-friendly tool can provide both fast and reproducible analyses of Sanger sequencing data for the two most common Cryptosporidium gene targets.
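The success rates quoted in the abstract can be reproduced from the counts given in parentheses. The short sketch below is only an arithmetic check; the dictionary keys and the function name are illustrative labels and are not taken from the paper or from the tool.

```python
# Counts reported in the abstract: (correctly typed, total chromatograms).
reported = {
    "SSU rRNA, single sequences": (428, 431),
    "SSU rRNA, mixed sequences": (154, 162),
    "gp60": (947, 991),
}


def success_rate(correct: int, total: int) -> float:
    """Return the percentage of correctly typed chromatograms, to one decimal."""
    return round(100 * correct / total, 1)


rates = {name: success_rate(c, t) for name, (c, t) in reported.items()}
for name, pct in rates.items():
    print(f"{name}: {pct}%")
```

Each computed value agrees with the percentage stated in the abstract for the same counts.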

Keywords: Genotyping tool; Mixed infections; SSU rRNA gene; Sanger sequencing; Validated database; gp60 gene.

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1
CryptoGenotyper schematic workflow. The program begins with the user supplying the gene target, the sample chromatograms and, optionally, a database. (a) If the chromatograms correspond to the gp60 gene target, the sequence is retrieved by analyzing the fluorescent channel intensities. A homology search is then performed against the reference database and the repeat region is calculated. (b) If the chromatograms correspond to the SSU rRNA gene, the sequence is computed from the log ratio of intensity and converted to IUPAC-based nucleotide code where double peaks appear. The sequence is then decomposed with Indelligent (Dmitriev and Rakitov, 2008) and a homology search is performed using BLAST against the reference database. If mixed sequences are detected, they are classified following the protocol outlined in Chang et al. (2012) for determining the most possible variances (MPVs) and optimal combinations. For both markers, the sequence and the species and/or subtype information are output.
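The double-peak step in branch (b) can be illustrated with a minimal sketch. It is not the CryptoGenotyper implementation: the function name `call_base`, the cutoff `max_log_ratio` and its default value, and the example intensities are all assumptions made for illustration. Only the two-base IUPAC ambiguity codes are standard. The idea is that when the two strongest channels at a position have similar intensity (a small log ratio), the position is reported as an ambiguity code rather than a single base.

```python
import math
from typing import Mapping

# Standard IUPAC codes for two-base ambiguities.
IUPAC_PAIRS = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def call_base(intensities: Mapping[str, float], max_log_ratio: float = 1.0) -> str:
    """Call one position from its four channel peak intensities.

    max_log_ratio is a hypothetical cutoff: if the natural log of
    (strongest / second strongest) is at or below it, the position is
    treated as a double peak and reported as an IUPAC ambiguity code.
    """
    ranked = sorted(intensities, key=intensities.get, reverse=True)
    top, second = ranked[0], ranked[1]
    if intensities[second] <= 0:
        return top  # no secondary signal, so a single base is called
    log_ratio = math.log(intensities[top] / intensities[second])
    if log_ratio <= max_log_ratio:
        return IUPAC_PAIRS[frozenset((top, second))]
    return top


# Made-up peak intensities for three consecutive positions.
peaks = [
    {"A": 900.0, "C": 20.0, "G": 15.0, "T": 10.0},   # clean A
    {"A": 30.0, "C": 500.0, "G": 10.0, "T": 420.0},  # C/T double peak
    {"A": 400.0, "C": 5.0, "G": 380.0, "T": 0.0},    # A/G double peak
]
sequence = "".join(call_base(p) for p in peaks)
print(sequence)  # AYR
```

A sequence containing such codes is what a decomposition step would then try to split into its component sequences; that step is not sketched here.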
Fig. 2
Graphical user interface of the Galaxy tool implementation. The user must upload Sanger sequence reads (.ab1) using the Get Data feature under Galaxy Tools. The sequence names will appear in the History on the right. To build a contig, forward and reverse reads must be supplied as a dataset pair. To process multiple samples at once, the reads must be supplied as a list. The following must then be selected: (A) the gene marker (SSU rRNA or gp60), (B) the reference database (default or custom) and (C) the type of sequences (forward only, reverse only, or forward and reverse). Afterwards, the appropriate sequencing files (D) must be selected. Once all information has been entered, the Execute button (E) launches the analysis. Final typing results appear in the History (F) as two entries: the extracted FASTA sequence(s) and a tab-delimited text report file for easy reporting. Workflows for concatenating results from multiple samples are available at https://github.com/phac-nml/CryptoGenotyper. The Galaxy tool implementation can be accessed at https://usegalaxy.eu/.
Fig. 3
The CryptoGenotyper FASTA output file. One of the result files that the CryptoGenotyper generates is a FASTA (.fa) file. For both gene target analyses, a header is written at the beginning of the file indicating the run parameters (reference file, program mode, forward and reverse primer names). (A) For the SSU rRNA gene target analysis, the sample name and the species identified are output along with the corresponding sequence. (B) For the gp60 gene target analysis, the output consists of the same name, species, and subtype, followed by the sequence. This file is designed so that the user can submit it directly to BLAST for further analysis, if desired.
Fig. 4
The CryptoGenotyper tab-delimited (.txt) output file. With each analysis, the CryptoGenotyper also generates a tab-delimited text file (.txt). (A) For the SSU rRNA gene target analysis, the sample name, analysis mode (forward, reverse, contig), whether mixed sequences were detected in the chromatogram, species, sequence, comments, and the BLAST statistics (bit score, query length, query coverage, e-value, percent identity, and accession number of the nearest BLAST hit) are recorded. (B) For the gp60 gene target analysis, the sample name, analysis mode (forward, reverse, contig), species, subtype, sequence, comments, average Phred quality of the chromatograms and the BLAST statistics (as described for SSU rRNA) are output.
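Because the report is tab-delimited, it can be read with Python's standard `csv` module. The sketch below assumes the file starts with a single row of column names; the column names and the two example rows are hypothetical stand-ins, since the caption lists the fields but not their exact labels, and they are not output from the tool.

```python
import csv
import io
from collections import Counter

# Hypothetical gp60-style report; real column labels and values may differ.
example_report = (
    "Sample Name\tType of Sequences\tSpecies\tSubtype\n"
    "sample01\tcontig\tC. parvum\tIIaA15G2R1\n"
    "sample02\tforward\tC. hominis\tIbA10G2\n"
)


def read_report(handle):
    """Return the report rows as dictionaries keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_report(io.StringIO(example_report))
species_counts = Counter(row["Species"] for row in rows)
for species, count in species_counts.items():
    print(species, count)
```

For a file on disk, the same function can be called on `open(path, newline="")`, where `path` points to the report.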
Fig. 5
Heterozygous peaks due to mixed SSU rRNA populations. Overlapping peaks are present throughout the SSU rRNA region, indicating mixed populations. (A) C. parvum and C. canis mixed infection. (B) Reverse Sanger sequence chromatogram representing the variant copies of the Type A and Type B (TGA polymorphism) SSU rRNA gene in a C. parvum isolate. (C) Forward Sanger sequence chromatogram representing the variant copies of the SSU rRNA gene in a C. hominis isolate with varying numbers of thymines.

References

    1. Åberg M., Emanuelson U., Troell K., Björkman C. Infection dynamics of Cryptosporidium bovis and Cryptosporidium ryanae in a Swedish dairy herd. Vet. Parasitol. X. 2019;1:100010. doi: 10.1016/j.vpoa.2019.100010.
    2. Åberg M., Emanuelson U., Troell K., Björkman C. A single-cohort study of Cryptosporidium bovis and Cryptosporidium ryanae in dairy cattle from birth to calving. Vet. Parasit. Reg. Stud. Rep. 2020;20:100400. doi: 10.1016/j.vprsr.2020.100400.
    3. Afgan E., Baker D., Batut B., van den Beek M., Bouvier D., Čech M., Chilton J., Clements D., Coraor N., Grüning B., Guerler A., Hillman-Jackson J., Hiltemann S., Jalili V., Rasche H., Soranzo N., Goecks J., Taylor J., Nekrutenko A., Blankenberg D. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Res. 2018;46(W1):W537–W544. doi: 10.1093/nar/gky379.
    4. Alsmark C., Nolskog P., Lindqvist Angervall A., Toepfer M., Winiecka-Krusnell J., Bouwmeester J., Bjelkmar P., Troell K., Lahti E., Beser J. Two outbreaks of cryptosporidiosis associated with cattle spring pasture events. Vet. Parasit. Reg. Stud. Rep. 2018;14:71–74. doi: 10.1016/j.vprsr.2018.09.003.
    5. Alves M., Xiao L., Sulaiman I., Lal A.A., Matos O., Antunes F. Subgenotype analysis of Cryptosporidium isolates from humans, cattle, and zoo ruminants in Portugal. J. Clin. Microbiol. 2003;41(6):2744–2747. doi: 10.1128/jcm.41.6.2744-2747.2003.