Nucleic Acids Res. 2018 Jan 4;46(D1):D1128-D1136.
doi: 10.1093/nar/gkx907.

BioMuta and BioXpress: mutation and expression knowledgebases for cancer biomarker discovery

Hayley M Dingerdissen et al. Nucleic Acids Res.

Abstract

Single-nucleotide variation and gene expression of disease samples represent important resources for biomarker discovery. Many databases have been built to host and make available such data to the community, but these databases are frequently limited in scope and/or content. BioMuta, a database of cancer-associated single-nucleotide variations, and BioXpress, a database of cancer-associated differentially expressed genes and microRNAs, differ from other disease-associated variation and expression databases primarily through the aggregation of data across many studies into a single source with a unified representation and annotation of functional attributes. Early versions of these resources were initiated by pilot funding for specific research applications, but newly awarded funds have enabled hardening of these databases to production-level quality and will allow for sustained development of these resources for the next few years. Because both resources were developed using a similar methodology of integration, curation, unification, and annotation, we present BioMuta and BioXpress as allied databases that will facilitate a more comprehensive view of gene associations in cancer. BioMuta and BioXpress are hosted on the High-performance Integrated Virtual Environment (HIVE) server at the George Washington University at https://hive.biochemistry.gwu.edu/biomuta and https://hive.biochemistry.gwu.edu/bioxpress, respectively.

Figures

Figure 1.
The development pipelines of BioMuta and BioXpress share several common features, including primary data sources, integration and ID mapping approaches, unification and interface design. Several sources supply primary data, including variation, expression, annotation and ontology/identifier data. In the ‘Data Retrieval’ portion of the figure, the sources to the far left are used only for BioMuta, and the sources to the far right are used only for BioXpress. The sources in the middle (between the dashed gray lines) contribute data to both BioMuta and BioXpress. Throughout data processing, a number of quality control (QC) steps are imposed to ensure integrity and accuracy of data, where possible. Processed data are unified by cancer type to the corresponding DOID(s) and entered into a MySQL database so that they can be searched by query on the web interfaces. * Due to the number of primary data sources, resources supplying only functional annotations are not included in the figure. Sources for functional annotations not pictured include: CDD, SysPTM, PhosphoSite, Phospho.ELM, dbSNO, HPRD and OGlycBase6.0. Additional annotations are supplied following analysis by PolyPhen and NetNGlyc.
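The unification step described in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the label-to-DOID table, the record fields, the table schema and the function names are all hypothetical, the DOID values are placeholders rather than real Disease Ontology assignments, and Python's built-in sqlite3 stands in for MySQL so that the example runs without a server.

```python
import sqlite3

# Hypothetical mapping from source-specific cancer labels to DOID terms.
# The identifiers are placeholders, not real Disease Ontology assignments.
LABEL_TO_DOID = {
    "brca": "DOID:PLACEHOLDER_1",
    "breast invasive carcinoma": "DOID:PLACEHOLDER_1",
    "lung adenocarcinoma": "DOID:PLACEHOLDER_2",
}

def unify_by_doid(records):
    """Attach a DOID to each record; set aside records whose label is unmapped (a simple QC step)."""
    unified, rejected = [], []
    for rec in records:
        key = rec["cancer_label"].strip().lower()
        doid = LABEL_TO_DOID.get(key)
        if doid is None:
            rejected.append(rec)
        else:
            unified.append((rec["gene"], rec["variant"], rec["source"], doid))
    return unified, rejected

def load_rows(conn, rows):
    """Store unified rows in a single table (sqlite3 used here in place of MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snv (gene TEXT, variant TEXT, source TEXT, doid TEXT)"
    )
    conn.executemany("INSERT INTO snv VALUES (?, ?, ?, ?)", rows)
    conn.commit()

def search_by_gene(conn, gene):
    """Return all stored variants for one gene, as a web interface query might."""
    cur = conn.execute(
        "SELECT variant, source, doid FROM snv WHERE gene = ? ORDER BY variant", (gene,)
    )
    return cur.fetchall()

# Made-up records from two imaginary sources that label the same cancer differently.
records = [
    {"gene": "GENE1", "variant": "p.A10T", "source": "source_a", "cancer_label": "BRCA"},
    {"gene": "GENE1", "variant": "p.G20D", "source": "source_b", "cancer_label": "Breast invasive carcinoma "},
    {"gene": "GENE2", "variant": "p.L5P", "source": "source_a", "cancer_label": "unknown tumour"},
]

conn = sqlite3.connect(":memory:")
unified, rejected = unify_by_doid(records)
load_rows(conn, unified)
hits = search_by_gene(conn, "GENE1")
```

In this sketch the two differently labelled records for GENE1 end up under the same placeholder DOID, while the record with an unmapped label is held back rather than loaded.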
Figure 2.
Interface of BioMuta. The BioMuta interface includes two types of search: a basic search for any gene name or accession, and an advanced search that combines up to four search terms. After the user clicks ‘search,’ an interactive interim results page appears, showing all genes that match the search criteria. After the user selects and clicks on the UniProtKB ID of the preferred gene, the detailed results page loads, displaying a figure and a table with information about all related SNVs and cancer types for this gene. APIs are integrated, and users can obtain search results in JSON format by providing specific URLs. In addition to the search function, whole BioMuta datasets can be downloaded from the tool home and archive pages.
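As a rough illustration of working with JSON results obtained by URL, the sketch below builds a query URL and summarizes a response. Only the base URL comes from the abstract; the endpoint path `api/search`, the `gene` query parameter and the JSON field names (`results`, `variant`, `cancer_type`) are assumptions made for this example and should be checked against the BioMuta documentation. No network request is made; a made-up payload stands in for a real response.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://hive.biochemistry.gwu.edu/biomuta"  # base URL given in the abstract

def build_query_url(gene, endpoint="api/search"):
    """Build a query URL. The endpoint path and the 'gene' parameter are hypothetical."""
    return f"{BASE_URL}/{endpoint}?{urlencode({'gene': gene})}"

def count_snvs_by_cancer(payload):
    """Count SNV rows per cancer type in a JSON payload (field names are assumed)."""
    data = json.loads(payload)
    counts = {}
    for row in data["results"]:
        cancer = row["cancer_type"]
        counts[cancer] = counts.get(cancer, 0) + 1
    return counts

# Made-up payload in the assumed shape; not real BioMuta output.
sample_payload = json.dumps({
    "results": [
        {"variant": "p.A10T", "cancer_type": "type_a"},
        {"variant": "p.G20D", "cancer_type": "type_a"},
        {"variant": "p.L5P", "cancer_type": "type_b"},
    ]
})

url = build_query_url("TP53")
counts = count_snvs_by_cancer(sample_payload)
```

Keeping URL construction separate from parsing means the parsing logic can be tested against saved responses without contacting the server.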
Figure 3.
Interface of BioXpress. The BioXpress interface includes two types of search: search by individual gene/miRNA, and search by cancer type. For both searches, the results page loads after the user clicks the ‘search’ button. On this page, a figure and a table display either the expression trend of a single gene/miRNA in cancer patients (for gene/miRNA search) or the expression trend of the top 20 transcripts in one particular cancer type (for cancer type search), along with the results from differential expression analysis. APIs, whole dataset downloads, and documentation are also available.
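The cancer type search described here can be pictured as a filter followed by a top-N selection. The caption does not say how the top 20 transcripts are ranked, so the sketch below assumes ranking by absolute log2 fold change among results that pass an adjusted p-value cutoff; the field names, the cutoff and the sample values are all hypothetical.

```python
import heapq

def top_transcripts(results, cancer_type, n=20, alpha=0.05):
    """Return up to n significant transcripts for one cancer type,
    ranked by absolute log2 fold change (an assumed criterion)."""
    significant = [
        r for r in results
        if r["cancer_type"] == cancer_type and r["adj_p"] < alpha
    ]
    return heapq.nlargest(n, significant, key=lambda r: abs(r["log2fc"]))

def expression_trend(row):
    """Label a transcript as over- or under-expressed from the sign of its fold change."""
    return "up" if row["log2fc"] > 0 else "down"

# Made-up differential expression results; not real BioXpress data.
results = [
    {"transcript": "T1", "cancer_type": "type_a", "log2fc": 2.5, "adj_p": 0.001},
    {"transcript": "T2", "cancer_type": "type_a", "log2fc": -3.1, "adj_p": 0.01},
    {"transcript": "T3", "cancer_type": "type_a", "log2fc": 4.0, "adj_p": 0.2},
    {"transcript": "T4", "cancer_type": "type_b", "log2fc": 5.0, "adj_p": 0.001},
]

top = top_transcripts(results, "type_a", n=2)
names = [r["transcript"] for r in top]
trends = [expression_trend(r) for r in top]
```

With the sample values, T3 is excluded because it fails the significance cutoff despite having the largest fold change, and T4 is excluded because it belongs to a different cancer type.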
