Nucleic Acids Res. 2024 Jan 5;52(D1):D808-D816.
doi: 10.1093/nar/gkad1003.

VEuPathDB: the eukaryotic pathogen, vector and host bioinformatics resource center in 2023

Jorge Alvarez-Jarreta et al. Nucleic Acids Res.

Abstract

The Eukaryotic Pathogen, Vector and Host Informatics Resource (VEuPathDB, https://veupathdb.org) is a Bioinformatics Resource Center funded by the National Institutes of Health with additional funding from the Wellcome Trust. VEuPathDB supports >600 organisms that comprise invertebrate vectors, eukaryotic pathogens (protists and fungi) and relevant free-living or non-pathogenic species or hosts. Since 2004, VEuPathDB has analyzed omics data from the public domain using contemporary bioinformatic workflows, including orthology predictions via OrthoMCL, and integrated the analysis results with analysis tools, visualizations, and advanced search capabilities. The unique data mining platform coupled with >3000 pre-analyzed data sets facilitates the exploration of pertinent omics data in support of hypothesis-driven research. Comparisons are easily made across data sets, data types and organisms. A Galaxy workspace offers the opportunity for the analysis of private large-scale data sets and for porting to VEuPathDB for comparisons with integrated data. The MapVEu tool provides a platform for exploration of spatially resolved data such as vector surveillance and insecticide resistance monitoring. To address the growing body of omics data and advances in laboratory techniques, VEuPathDB has added several new data types, searches and features, improved the Galaxy workspace environment, redesigned the MapVEu interface and updated the infrastructure to accommodate these changes.

Figures

Graphical Abstract
Figure 1.
Single-cell Transcriptomics: a new data type supported in VEuPathDB. (A) CELLXGENE application for the visualization and analysis of scRNA-Seq data currently displaying data for RACK1, PBANAKA_0703900. (B) Interactive metadata display for exploring subsets based on experimental parameters. The cluster name category displays expression profiles for RACK1 in each sample. Check boxes are used for configuring the UMAP display and/or defining subgroups for differential expression analysis. (C) Tool set for subgroup selection and display. Center panel of the CELLXGENE app with interactive tools used for subgroup selection, initiating differential expression analysis or manipulating the UMAP cluster images displayed below the tool set. (D) The Genes by Single Cell RNA-Seq Evidence search returns all genes with data in a particular scRNA-Seq data set. Results of the Genes by Single Cell RNA-Seq Evidence search provide data columns for easy access to the CELLXGENE app (E) and the scRNA-Seq section of the gene's record page (F). (G) Multi-step strategy to investigate the correlation between bulk RNA-Seq in Plasmodium vivax and scRNA-Seq data in Plasmodium berghei. The full strategy can be found at https://plasmodb.org/plasmo/app/workspace/strategies/import/5f64cd2e2b16b8aa. (H) The first column of the search result table provides a link for easy access to gene pages. (I) The results of the differential expression analysis configured in C are displayed in the Gene Sets panel and, in this case, reveal other known liver-specific genes.
Figure 2.
AlphaFold Structural Predictions: a new data type supported in VEuPathDB. (A) The Genes by Text search used to begin a two-step strategy to reveal genes involved in fungal filamentation. (B) Strategy graphic and result table. The Step 1 text search for ‘filament*’ includes the asterisk to broaden the query term to include plurals or compound words that begin with filament, e.g. filamentation, filaments. The text search results are intersected with the Genes by AlphaFold Predictions search to find 788 genes from the text search that have AlphaFold data. (C) Many genes returned by the two-step strategy lack specific gene product information and are classified as ‘unspecified product’. Gene page header (D) and AlphaFold data section (E) for the gene CTMYA2_05 600, which is classified as an unspecified product. (E) The AlphaFold structure prediction data support the identity of CTMYA2_05 600 as a possible Carbon catabolite derepressing protein kinase, which plays a role in filamentation. The full strategy can be found at https://fungidb.org/fungidb/app/workspace/strategies/import/19cde8976530fcde.
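The two-step strategy in this caption (a wildcard text search intersected with a second search) can be illustrated with a minimal Python sketch. This is not VEuPathDB's implementation or API: the gene IDs, descriptions, the `text_search` helper and the `genes_with_alphafold` set are all hypothetical toy data chosen only to show how a trailing asterisk broadens a term and how an intersection narrows the result.

```python
import fnmatch

# Hypothetical toy records: gene ID -> product description (not real VEuPathDB data).
gene_descriptions = {
    "geneA": "protein required for filamentation",
    "geneB": "regulator of filaments and hyphal growth",
    "geneC": "unspecified product",
    "geneD": "filamentous growth regulator",
}

# Hypothetical set of gene IDs that have a structure prediction available.
genes_with_alphafold = {"geneA", "geneC", "geneD"}


def text_search(records, pattern):
    """Return IDs whose description contains a word matching the wildcard pattern."""
    pattern = pattern.lower()
    return {
        gene_id
        for gene_id, text in records.items()
        if any(fnmatch.fnmatchcase(word, pattern) for word in text.lower().split())
    }


# Step 1: 'filament*' matches any word that begins with "filament".
step1 = text_search(gene_descriptions, "filament*")

# Step 2: intersect with the genes that have prediction data.
step2 = step1 & genes_with_alphafold

print(sorted(step1))
print(sorted(step2))
```

In this toy example the intersection keeps only genes present in both result sets, which mirrors how the strategy narrows the text search hits to those with AlphaFold data.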
Figure 3.
My Organism Preferences: a new tool for configuring menu displays. (A) The tool allows users to limit menu display to information pertaining to only their organisms of interest. Available from the header (B) of any page, the tool allows users to choose their preferred organisms from the tree of all organisms in the site. Choices made in the full tree are reflected on the right side of the tool. Once choices are applied (C), the site menus are limited to the user's organism preferences. (D) The tool is easily disabled with a toggle switch in the header.
Figure 4.
The MapVEu web application and exploratory data analysis (EDA) platform facilitates access to and exploration of geospatial data in VectorBase. Features of the MapVEu platform include a full-screen map with semantic zooming (A), wherein clicking on a marker causes it to zoom in and disaggregate. The user interface features a menu (B) that can be used to configure the map with markers to visualize any of the variables in the dataset (B), to filter continuous and categorical data points (C), and to make floating plots for data currently visible on the map (D). Shown here are donut markers for vector species, with the analysis restricted to records containing data for the prevalence of the Kdr L1014F mutation associated with insecticide resistance, and a supporting plot visualizing trends in prevalence of the Kdr L1014F mutation over a 20-year period for each of the selected species. All data can be downloaded as customizable flat files (E), and analyses can be shared (F).
