BMC Genomics. 2010 Oct 25;11:603.
doi: 10.1186/1471-2164-11-603.

A novel multifunctional oligonucleotide microarray for Toxoplasma gondii

Amit Bahl et al. BMC Genomics. 2010.

Abstract

Background: Microarrays are invaluable tools for genome interrogation, SNP detection, and expression analysis, among other applications. Such broad capabilities would be of value to many pathogen research communities, although the development and use of genome-scale microarrays is often a costly undertaking. Therefore, effective methods for reducing unnecessary probes while maintaining or expanding functionality would be relevant to many investigators.

Results: Taking advantage of available genome sequences and annotation for Toxoplasma gondii (a pathogenic parasite responsible for illness in immunocompromised individuals) and Plasmodium falciparum (a related parasite responsible for severe human malaria), we designed a single oligonucleotide microarray capable of supporting a wide range of applications at relatively low cost, including genome-wide expression profiling for Toxoplasma, and single-nucleotide polymorphism (SNP)-based genotyping of both T. gondii and P. falciparum. Expression profiling of the three clonotypic lineages dominating T. gondii populations in North America and Europe provides a first comprehensive view of the parasite transcriptome, revealing that ~49% of all annotated genes are expressed in parasite tachyzoites (the acutely lytic stage responsible for pathogenesis) and 26% of genes are differentially expressed among strains. A novel design utilizing few probes provided high confidence genotyping, used here to resolve recombination points in the clonal progeny of sexual crosses. Recent sequencing of additional T. gondii isolates identifies >620 K new SNPs, including ~11 K that intersect with expression profiling probes, yielding additional markers for genotyping studies, and further validating the utility of a combined expression profiling/genotyping array design. Additional applications facilitating SNP and transcript discovery, alternative statistical methods for quantifying gene expression, etc. are also pursued at pilot scale to inform future array designs.

Conclusions: In addition to providing an initial global view of the T. gondii transcriptome across major lineages and permitting detailed resolution of recombination points in a historical sexual cross, the multifunctional nature of this array also allowed opportunities to exploit probes for purposes beyond their intended use, enhancing analyses. This array is in widespread use by the T. gondii research community, and several aspects of the design strategy are likely to be useful for other pathogens.

Figures

Figure 1
Differential expression between clonal lineages. A, MA plots (intensity ratio versus average intensity) for hybridizations with representatives of the three major clonal lineages of Toxoplasma show a very high degree of reproducibility among biological replicates (comparisons shown along the diagonal), and a significant number of differentially expressed genes between lineages (blue dots). Tables list genes exhibiting the greatest differences in hybridization intensity for each pairwise comparison, ranked by estimated fold-change (asterisks indicate genes where at least four probes are polymorphic). B, Gene presence was determined using a 10% false discovery rate (see Materials & Methods for details), resulting in 42% of genes called present in RH-, 38% in Prugniaud-, and 40% in VEG-strain parasites, with an aggregate total of 49% of genes called present in any strain during the tachyzoite (lytic) life stage. C, 5,307 genes exhibit differences in between-strain expression levels at a P-value of 1 × 10⁻³ (corrected for multiple testing); filtering to eliminate genes for which 4 or more probes are polymorphic, the fold change is under 2-fold, or the gene is called absent (at a 10% FDR) leaves 2,078 genes with clear evidence of strain-specific differential expression. The pie chart indicates the distribution of differentially regulated genes by strain (+ and - indicate up- and down-regulation, respectively).
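
For readers unfamiliar with MA plots, the two axes described in panel A can be computed from a pair of intensity vectors roughly as follows. This is a minimal sketch and not the authors' analysis pipeline: the function name `ma_values` and the example intensities are hypothetical, and base-2 logarithms are assumed, as is conventional for microarray data.

```python
import math

def ma_values(x, y):
    """Return (M, A) lists for paired positive intensities x and y.

    M is the log2 intensity ratio; A is the mean log2 intensity.
    """
    m_vals = [math.log2(a) - math.log2(b) for a, b in zip(x, y)]
    a_vals = [0.5 * (math.log2(a) + math.log2(b)) for a, b in zip(x, y)]
    return m_vals, a_vals

# Hypothetical intensities for three genes in two hybridizations.
m, a = ma_values([1024.0, 256.0, 64.0], [256.0, 256.0, 256.0])
print(m)  # [2.0, 0.0, -2.0]
print(a)  # [9.0, 8.0, 7.0]
```

Genes with M near zero behave the same in both hybridizations; points far from zero are candidates for differential expression, subject to the statistical testing the caption describes.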
Figure 2
Genotyping design. A, The chromosome map illustrates three tiers of genotyping content present on the Toxoplasma microarray. The triangles mark the published RFLP markers, which represent genotyping capabilities prior to this work. The filled triangles represent those markers for which we have provided probesets that passed a rigorous screening process. The top half of each chromosome bar represents the EST-based SNPs, and the bottom half shows the single-feature polymorphisms (SFPs) that have passed screening. The table lists the exact numbers of SNPs represented on the array, those that passed screening, and the probe content for each. B, An expanded view of chromosome Ib indicates the SNP frequency derived from comparative sequence analysis for the three archetypal strains (see Additional File 3), and indicates the location of probesets designed for SNP detection. C, A magnified view of chromosome Ib demonstrates the overlap of SNPs with probes primarily designed for transcriptional profiling. Pink triangles indicate those probes which overlap SNP locations, and can be used to detect SFPs (see text).
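
The overlap in panel C, between SNP positions and expression profiling probes, amounts to an interval lookup. The sketch below illustrates the idea only; the function name `probes_overlapping_snps`, the 1-based inclusive coordinate convention, and the example coordinates are all assumptions rather than details taken from the array design.

```python
import bisect

def probes_overlapping_snps(probes, snp_positions):
    """Return names of probes whose [start, end] interval contains a SNP.

    probes: list of (name, start, end) tuples, 1-based inclusive coordinates.
    snp_positions: iterable of 1-based SNP coordinates on the same chromosome.
    """
    snps = sorted(snp_positions)
    hits = []
    for name, start, end in probes:
        # Index of the first SNP located at or after the probe start.
        i = bisect.bisect_left(snps, start)
        if i < len(snps) and snps[i] <= end:
            hits.append(name)
    return hits

# Hypothetical 25-nt probes and SNP coordinates.
example_probes = [("p1", 100, 124), ("p2", 200, 224), ("p3", 300, 324)]
overlapping = probes_overlapping_snps(example_probes, [110, 250, 324])
print(overlapping)  # ['p1', 'p3']
```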
Figure 3
SNP detection performance. A, Performance of a single sense-strand probe: The ability of a single sense-strand probe overlapping a SNP to call the correct allele is shown as a function of the distance from the center of the probe to the SNP. At a stringent P-value threshold of 10⁻⁴, approximately 65% of SNPs are called correctly using a probe centered exactly on the SNP (see haploid genotyping simulation section in Materials & Methods for a description of P-value calculations). B, Performance of an alternate probe when the centered sense probe fails: When the centered probe fails to call the correct allele at a chosen threshold, the ability of one additional probe to rescue the call is shown as a function of strand and distance of the probe relative to the SNP. Probes on the sense strand at close distances to the SNP contribute little, presumably due to the same local constraints that caused the centered sense probe to fail, whereas the opposite-strand centered probe recovers 60% of missed calls at a threshold of 10⁻⁴. Therefore, at a threshold of 10⁻⁴, we achieve an 86% success rate.
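
The 86% figure follows from the two rates quoted in the caption: the centered sense probe calls about 65% of SNPs, and the opposite-strand probe recovers 60% of the remaining 35%. A short check of that arithmetic, using hypothetical variable names and the rounded percentages from the caption:

```python
centered_sense = 0.65  # fraction called correctly by the centered sense probe
rescue_rate = 0.60     # fraction of missed calls recovered by the opposite-strand probe

# Successes from the first probe plus rescued failures from the second.
combined = centered_sense + (1 - centered_sense) * rescue_rate
print(f"{combined:.0%}")  # 86%
```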
Figure 4
Detecting crossovers. A chromosomal SNP map for a recombinant progeny (clone A6AF) of a GT1 (type I) × CTG (type III) cross is represented, along with the published (triangles) and array-based (lines) genotyping calls for this clone. There is almost total agreement between markers called by both methods (>98.5%). The inset table summarizes the benefits of mapping crossovers using the array across 5 randomly selected progeny, showing that on average more breakpoints are discovered and that they are confined to regions approximately 11-fold smaller. The numbers in parentheses in the breakpoint columns represent previous results using RFLP analysis.
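
The reason denser markers shrink breakpoint regions can be seen in a simple sketch: a crossover is bounded by the nearest pair of adjacent markers whose parental calls differ. The code below is an illustration under that assumption, not the authors' method; the function name `crossover_intervals` and the marker coordinates are hypothetical.

```python
def crossover_intervals(markers):
    """Bound crossovers from ordered parental genotype calls.

    markers: list of (position, parent) pairs sorted by position on one
    chromosome. Returns (left_pos, right_pos) intervals between adjacent
    markers whose parental calls differ.
    """
    intervals = []
    for (pos_a, par_a), (pos_b, par_b) in zip(markers, markers[1:]):
        if par_a != par_b:
            intervals.append((pos_a, pos_b))
    return intervals

# Hypothetical calls: "I" for the GT1 parent, "III" for the CTG parent.
calls = [(1000, "I"), (5000, "I"), (9000, "III"), (12000, "III"), (20000, "I")]
breakpoints = crossover_intervals(calls)
print(breakpoints)  # [(5000, 9000), (12000, 20000)]
```

Adding markers inside an interval can only narrow it, which is consistent with the smaller breakpoint regions reported for the array relative to RFLP analysis.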
Figure 5
SNP discovery and gene expression profiling in the apicoplast. The T. gondii plastid (apicoplast; RH strain sequence) was tiled at a 25 nt resolution on alternating strands, allowing probe-level expression profiling across the entire organelle. Expression patterns (inner circle; red and blue bars represent opposite strands and high absolute expression; grey bars represent low expression levels) are consistent with operon transcription, with two major origins of transcription evident at the LSU rRNA genes, running in opposite directions (as indicated by the arrows). SFPs were also uncovered using DNA hybridization differences between GT1 (type I), Pru (type II), and CTG (type III), revealing 43 type II SNPs (green diamonds), 12 type III SNPs (blue diamonds), and no type I SNPs (red diamonds).
Figure 6
T. gondii genes, probes, and probe-level expression profiles. Top panel shows a 50 kb region of chromosome Ib, illustrating, in addition to the 3'-biased expression profiling probes that are available genome-wide, the high density of probes available for this chromosome, including intron, exon, and antisense probes for each annotated gene (blue genes run from left to right; red from right to left). Transcript discovery probes interrogate unannotated EST clusters (≥3 ESTs) and ORFs (≥150 nt) that intersect with BLAST hits (bitscore ≥100). A barplot provides normalized probe-level expression data (union of sense probe intensities from antisense target kits, antisense probe intensities from sense kit), indicating probable expression of unannotated EST clusters and BLAST hits. See text and Table 1 for further details. Bottom panel displays a 6 kb span at higher resolution, illustrating the validation of gene structure, and comparable transcription levels in upstream ESTs that may correspond to non-coding exons.
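
The selection thresholds for transcript discovery probes can be expressed as a simple filter. This sketch assumes that the BLAST bitscore requirement applies to ORFs only, which the caption does not state unambiguously; the record layout, field names, and the function `discovery_candidates` are hypothetical.

```python
MIN_ESTS = 3        # EST clusters need at least 3 ESTs
MIN_ORF_NT = 150    # ORFs need to be at least 150 nt long
MIN_BITSCORE = 100  # intersecting BLAST hits need a bitscore of at least 100

def discovery_candidates(est_clusters, orfs):
    """Return ids of features meeting the thresholds given in the caption.

    est_clusters: list of dicts with 'id' and 'n_ests'.
    orfs: list of dicts with 'id', 'length_nt', and 'best_bitscore'
          (None when the ORF intersects no BLAST hit).
    """
    keep = [c["id"] for c in est_clusters if c["n_ests"] >= MIN_ESTS]
    keep += [
        o["id"] for o in orfs
        if o["length_nt"] >= MIN_ORF_NT
        and o["best_bitscore"] is not None
        and o["best_bitscore"] >= MIN_BITSCORE
    ]
    return keep

# Hypothetical unannotated features.
clusters = [{"id": "est_1", "n_ests": 5}, {"id": "est_2", "n_ests": 2}]
orfs = [
    {"id": "orf_1", "length_nt": 300, "best_bitscore": 150.0},
    {"id": "orf_2", "length_nt": 120, "best_bitscore": 400.0},
    {"id": "orf_3", "length_nt": 600, "best_bitscore": None},
]
selected = discovery_candidates(clusters, orfs)
print(selected)  # ['est_1', 'orf_1']
```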
Figure 7
Validation of gene models. A, At a false discovery rate of 10%, ~64% of all exons on chromosome Ib are called "Present" (see Materials & Methods for details). B, These P/A calls are highly non-random in their distribution, with multi-exon genes showing good consistency among their individual exon calls (i.e. the majority of genes show expression of most annotated exons, or no expression at all). C, Among genes with inconsistent P/A calls, expression patterns are often clustered, suggesting alternative gene models. For example, the expression patterns associated with the kinesin motor domain-containing protein (25.m01768) suggest a coding start site that begins with the eighth exon.
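
The per-gene consistency described in panel B can be summarized as the fraction of each gene's exons called present: values of 0 or 1 are fully consistent, and intermediate values flag genes whose annotated model may need revision. The sketch below assumes present/absent calls are already available per exon; the function names and gene ids are hypothetical.

```python
from collections import defaultdict

def exon_call_summary(exon_calls):
    """Return {gene_id: fraction of exons called present}.

    exon_calls: iterable of (gene_id, is_present) pairs, one per exon.
    """
    per_gene = defaultdict(list)
    for gene_id, is_present in exon_calls:
        per_gene[gene_id].append(bool(is_present))
    return {g: sum(calls) / len(calls) for g, calls in per_gene.items()}

def inconsistent_genes(summary):
    """Return sorted ids of genes with a mix of present and absent exons."""
    return sorted(g for g, frac in summary.items() if 0.0 < frac < 1.0)

# Hypothetical exon-level calls for three genes.
example_calls = [
    ("geneA", True), ("geneA", True), ("geneA", True),
    ("geneB", False), ("geneB", False),
    ("geneC", False), ("geneC", False), ("geneC", True), ("geneC", True),
]
summary = exon_call_summary(example_calls)
print(summary)                      # {'geneA': 1.0, 'geneB': 0.0, 'geneC': 0.5}
print(inconsistent_genes(summary))  # ['geneC']
```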
