Nucleic Acids Res. 2017 Jan 4;45(D1):D581-D591.
doi: 10.1093/nar/gkw1105. Epub 2016 Nov 29.

EuPathDB: the eukaryotic pathogen genomics database resource

Cristina Aurrecoechea et al. Nucleic Acids Res. 2017.

Abstract

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host-pathogen interactions.

Figures

Figure 1.
PlasmoDB strategy showing graphical interface for exploring relationships across data sets, data types and organisms. (The strategy can be found here: http://plasmodb.org/plasmo/im.do?s=7b88206dd42007c8) (A) Home page bubble for choosing the first search of a strategy, showing the ‘Predicted Signal Peptide’ search categorized under ‘Protein targeting and localization’. Clicking on the search title opens a form where users are prompted to choose required parameter values (if any) and initiate the search. The results of this search are displayed in Step 1 of panel C. (B) Interface for choosing subsequent searches. To add the Ribosomal profiling search, which is based on RNA Seq data, users navigate the interface through ‘Run a new search for’, ‘Genes’, ‘Transcriptomics’, ‘RNA Seq Evidence’. Alternatively, to transform a result into orthologs of another species, as in Step 3 of the strategy, users choose ‘Transform by Orthology’ (green arrow) instead of following the navigation described above. (C) Three-step strategy that returns P. vivax orthologs (Step 3) of P. falciparum genes that are likely translated in merozoites (Step 2) and that are predicted to encode proteins with signal peptides (Step 1). (D) Table detailing the data sets and data types interrogated in this strategy.
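The logic of the three-step strategy in panel C can be read as a set intersection followed by an orthology mapping. The following is a minimal Python sketch of that idea only. The gene IDs, the ortholog table and both function names are hypothetical placeholders, not PlasmoDB data or EuPathDB code.

```python
def intersect_steps(step1_genes, step2_genes):
    """Return genes found by both searches (Step 1 AND Step 2)."""
    return set(step1_genes) & set(step2_genes)


def transform_by_orthology(genes, ortholog_table):
    """Map each gene to its orthologs in another species.

    Genes without an entry in the table contribute nothing to the result.
    """
    orthologs = set()
    for gene in genes:
        orthologs.update(ortholog_table.get(gene, ()))
    return orthologs


# Hypothetical IDs, for illustration only.
signal_peptide = {"PfGeneA", "PfGeneB", "PfGeneC"}            # Step 1 result
translated_in_merozoites = {"PfGeneB", "PfGeneC", "PfGeneD"}  # second search
pf_to_pv = {"PfGeneB": ["PvGeneX"], "PfGeneC": ["PvGeneY", "PvGeneZ"]}

step2 = intersect_steps(signal_peptide, translated_in_merozoites)
step3 = transform_by_orthology(step2, pf_to_pv)
print(sorted(step3))
```

Keeping each step as a plain set makes it straightforward to swap the intersection for a union or a difference when two searches should be combined differently.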
Figure 2.
Galaxy Workspace. (A) FungiDB header showing the Analyze My Experiment (orange box) link for navigating to the EuPathDB Galaxy Workspace. (B) The EuPathDB Galaxy Workspace home page with preconfigured workflows available in the center section. Available tools are located in the left panel, and the History panel, which shows result and data files, is on the right in green. The ‘Display in FungiDB’ link (black box) navigates to GBrowse with the Galaxy data file open as a data track in the user's current GBrowse session. (C) Partial workflow showing the ‘drag and drop’ function for building a workflow. (D) Bigwig file displayed in FungiDB GBrowse directly from EuPathDB Galaxy using the ‘Display in FungiDB’ (black box) link in panel B.
Figure 3.
Explore transcripts and enrichment analyses. (A) PlasmoDB 2-step strategy that returns genes with signal peptides that are likely translated based on ribosomal transcriptomics data. (The strategy can be found here: http://plasmodb.org/plasmo/im.do?s=859df329f857438e) (B) The result table contains a column of Transcript IDs. (C) When a search returns transcript subsets, the Gene Result tab will contain a statement inviting users to explore the transcript results. Clicking ‘Explore’ opens the Explore Transcripts tool. (D) The Explore Transcripts tool for viewing transcripts that did or did not meet the search criteria for the current or previous searches. Choosing an option and clicking Apply Selection will filter the strategy result and display your chosen transcripts in the Gene Result tab. (E) The Analyze Results tab opens a new tab for your chosen enrichment analysis. (F) Gene Ontology Enrichment Analysis Tool. Analysis results appear below the parameters and include enriched terms plus P-values.
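To make concrete what an enrichment P-value in panel F measures, here is a small Python sketch. It assumes a one-sided hypergeometric test, which is a common choice for GO enrichment; the caption does not state which test EuPathDB applies. The function name and the counts in the example are illustrative, not values from PlasmoDB.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, genes_with_term, result_size, result_with_term):
    """One-sided P(X >= result_with_term).

    X is the number of term-annotated genes among result_size genes drawn
    without replacement from a background of total_genes genes, of which
    genes_with_term carry the term.
    """
    upper = min(genes_with_term, result_size)
    denominator = comb(total_genes, result_size)
    tail = sum(
        comb(genes_with_term, k) * comb(total_genes - genes_with_term, result_size - k)
        for k in range(result_with_term, upper + 1)
    )
    return tail / denominator


# Illustrative counts: 20 background genes, 5 annotated with the term,
# and a search result of 4 genes of which 3 carry the term.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(p_value)
```

A small P-value means the term appears in the result more often than random sampling from the background would be expected to produce. In practice such values are usually corrected for the number of terms tested.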
Figure 4.
Redesigned Gene Page. URL for this gene page: http://plasmodb.org/plasmo/app/record/gene/PF3D7_0905700 (A) Gene IDs and product descriptions are displayed in the upper left corner with other information and links directly below. (B) ‘Shortcuts’ serve two functions. Clicking on the Shortcut's magnifying glass icon offers a larger view of the data, while clicking on the image (or its title) navigates to the data within the gene page. (C) The collapsible, interactive and searchable ‘Contents’ section reflects EDAM-based categories and remains visible and stationary while the user scrolls through the data (panel D). A blue section indicator (circle) points to the currently displayed data category. The check boxes to the right of the category names can be used to hide data. (D) Data are presented in collapsible, interactive, searchable, and sortable tables that contain transcript-specific information when data can be unambiguously assigned to a transcript. (E) The ‘Transcriptomics’ table, featuring expandable rows with detailed information and graphs for each data set and coverage plots for RNA sequence data sets (showing one of eight tracks to conserve space in this figure). (F) Protein features table with the same expandable structure as the Transcriptomics table, showing protein domains, BLASTP Hits, Low Complexity Regions and Secondary Structure predictions.
Figure 5.
Filter Parameter for composing sample groups based on metadata. (A) Samples are chosen from participants age 0 to 10. The left panel displays categories of sample characteristics while the right shows details of the data for that category. A summary of the sample group characteristics appears above the panel—333 out of 421 samples are below age 10.9 (blue arrow). (B) Adding a characteristic to refine the sample group. A second characteristic is chosen from the left panel (Health Status) and the Malaria group is chosen. The summary now shows the group characteristics—263 out of 421 samples have age <10.9 and malaria health status (blue arrow).
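The way the Filter Parameter narrows a sample group by stacking characteristics can be sketched in a few lines of Python. The sample records, field names and the `filter_samples` helper below are assumptions made for illustration; they are not the EuPathDB data model, and the counts they produce are unrelated to the 421 samples in the figure.

```python
def filter_samples(samples, age_below=None, health_status=None):
    """Return samples that satisfy every criterion that is set (criteria combine with AND)."""
    selected = []
    for sample in samples:
        if age_below is not None and not sample["age"] < age_below:
            continue
        if health_status is not None and sample["health_status"] != health_status:
            continue
        selected.append(sample)
    return selected


# Hypothetical metadata records.
samples = [
    {"id": "S1", "age": 3.5, "health_status": "Malaria"},
    {"id": "S2", "age": 8.0, "health_status": "Healthy"},
    {"id": "S3", "age": 14.2, "health_status": "Malaria"},
    {"id": "S4", "age": 12.0, "health_status": "Malaria"},
]

young = filter_samples(samples, age_below=10.9)
young_malaria = filter_samples(samples, age_below=10.9, health_status="Malaria")

# Summary lines in the style of the blue-arrow summaries in the figure.
print(f"{len(young)} out of {len(samples)} samples")
print(f"{len(young_malaria)} out of {len(samples)} samples")
```

Each added characteristic can only keep or shrink the group, which mirrors how the summary count in panel B is smaller than the one in panel A.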

Similar articles

  • EuPathDB: The Eukaryotic Pathogen Genomics Database Resource.
    Warrenfeltz S, Basenko EY, Crouch K, Harb OS, Kissinger JC, Roos DS, Shanmugasundram A, Silva-Franco F. Methods Mol Biol. 2018;1757:69-113. doi: 10.1007/978-1-4939-7737-6_5. PMID: 29761457 Free PMC article.
  • VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center.
    Amos B, Aurrecoechea C, Barba M, Barreto A, Basenko EY, Bażant W, Belnap R, Blevins AS, Böhme U, Brestelli J, Brunk BP, Caddick M, Callan D, Campbell L, Christensen MB, Christophides GK, Crouch K, Davis K, DeBarry J, Doherty R, Duan Y, Dunn M, Falke D, Fisher S, Flicek P, Fox B, Gajria B, Giraldo-Calderón GI, Harb OS, Harper E, Hertz-Fowler C, Hickman MJ, Howington C, Hu S, Humphrey J, Iodice J, Jones A, Judkins J, Kelly SA, Kissinger JC, Kwon DK, Lamoureux K, Lawson D, Li W, Lies K, Lodha D, Long J, MacCallum RM, Maslen G, McDowell MA, Nabrzyski J, Roos DS, Rund SSC, Schulman SW, Shanmugasundram A, Sitnik V, Spruill D, Starns D, Stoeckert CJ, Tomko SS, Wang H, Warrenfeltz S, Wieck R, Wilkinson PA, Xu L, Zheng J. Nucleic Acids Res. 2022 Jan 7;50(D1):D898-D911. doi: 10.1093/nar/gkab929. PMID: 34718728 Free PMC article.
  • Accessing Cryptosporidium Omic and Isolate Data via CryptoDB.org.
    Warrenfeltz S, Kissinger JC; EuPathDB Team. Methods Mol Biol. 2020;2052:139-192. doi: 10.1007/978-1-4939-9748-0_10. PMID: 31452162
  • Using CellMiner 1.6 for Systems Pharmacology and Genomic Analysis of the NCI-60.
    Reinhold WC, Sunshine M, Varma S, Doroshow JH, Pommier Y. Clin Cancer Res. 2015 Sep 1;21(17):3841-52. doi: 10.1158/1078-0432.CCR-15-0335. Epub 2015 Jun 5. PMID: 26048278 Free PMC article. Review.
  • Protein Bioinformatics Databases and Resources.
    Chen C, Huang H, Wu CH. Methods Mol Biol. 2017;1558:3-39. doi: 10.1007/978-1-4939-6783-4_1. PMID: 28150231 Free PMC article. Review.
