2011 Jul 23;2011:bar030.
doi: 10.1093/database/bar030. Print 2011.

Ensembl BioMarts: a hub for data retrieval across taxonomic space

Rhoda J Kinsella et al. Database (Oxford).

Abstract

For a number of years the BioMart data warehousing system has proven to be a valuable resource for scientists seeking a fast and versatile means of accessing the growing volume of genomic data provided by the Ensembl project. The launch of the Ensembl Genomes project in 2009 complemented the Ensembl project by utilizing the same visualization, interactive and programming tools to provide users with a means for accessing genome data from a further five domains: protists, bacteria, metazoa, plants and fungi. The Ensembl and Ensembl Genomes BioMarts provide a point of access to the high-quality gene annotation, variation data, functional and regulatory annotation and evolutionary relationships from genomes spanning the taxonomic space. This article aims to give a comprehensive overview of the Ensembl and Ensembl Genomes BioMarts as well as some useful examples and a description of current data content and future objectives.

Database URLs: http://www.ensembl.org/biomart/martview/; http://metazoa.ensembl.org/biomart/martview/; http://plants.ensembl.org/biomart/martview/; http://protists.ensembl.org/biomart/martview/; http://fungi.ensembl.org/biomart/martview/; http://bacteria.ensembl.org/biomart/martview/.
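
As a rough illustration of the programmatic access the abstract mentions, the sketch below builds a BioMart-style XML query and the URL that would carry it. Several things here are assumptions not stated in the abstract: the `martservice` path (only the `martview` URLs are listed), the XML layout of `Query`, `Dataset`, `Filter` and `Attribute` elements, and the dataset, filter and attribute names, which may differ between releases. The helper names `build_query` and `query_url` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed service endpoint; the abstract itself lists only the martview URLs.
MARTSERVICE = "http://www.ensembl.org/biomart/martservice"


def build_query(dataset, filters, attributes):
    """Return a BioMart-style XML query string (element layout is an assumption)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",      # tab-separated output
        "header": "0",           # no header row
        "uniqueRows": "1",       # collapse duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which rows are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes choose which columns are returned.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


def query_url(xml_query, base=MARTSERVICE):
    """Encode the XML query as a `query` parameter on the service URL."""
    return base + "?" + urlencode({"query": xml_query})


# Illustrative names only: dataset, filter and attribute identifiers are assumed.
example_xml = build_query(
    "hsapiens_gene_ensembl",
    {"chromosome_name": "22"},
    ["ensembl_gene_id", "external_gene_name"],
)
example_url = query_url(example_xml)
```

The same builder could be pointed at one of the Ensembl Genomes hosts by passing a different `base`, provided those hosts expose the same service path, which is also an assumption here.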

Figures

Figure 1.
There are 777 Ensembl protein coding genes that code for the GPCR domain with InterPro ID (IPR000276) and that are detectable with the Affy HuGene 1_0 st v1 array 25.
Figure 2.
The esv263 structural variation from DGVa occurs between 16 265 092 and 16 446 378 bp on chromosome 12.
Figure 3.
There are 100 single nucleotide polymorphisms in the human somatic variation data set associated with tumors in the eye. The list of Ensembl gene IDs containing these variations can be downloaded for further study, or one can click an entry in the Ensembl Gene ID column of the interface to follow a link to the main Ensembl website.
Figure 4.
Five dbSNP rs IDs were used to filter the human variation data set, and the Ensembl gene IDs containing these five variations were selected as attributes. The query was then linked to a second data set, the human gene data set from the Ensembl Genes database, where the HGNC ID and symbol were selected as attributes to retrieve the corresponding gene names from HGNC: FAN1, MTMR10 and EEF1DP3.
Figure 5.
The genes in the filtered region were lacA, lacY and lacZ and we can see that there are no orthologs for the lacZ gene in the E. coli DH10B strain.
Figure 6.
The Ensembl gene IDs for the three APL1 genes are retrieved first and then used to filter the A. gambiae data set. Fifty variations were retrieved that lie within the three genes of the APL1 locus.
Figure 7.
The ability to retrieve sequence information for genes of interest is a powerful feature of the BioMart tool. Here a user can download the coding sequence for all genes on chromosome 22 as well as additional information about each gene and this can be exported in a useful format.
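
A sequence export like the one described for Figure 7 is commonly delivered as FASTA text. The sketch below shows one way such an export might be split into records once downloaded. It assumes a header line of pipe-separated attribute values, which is not confirmed by the caption, and the sample input is made up for illustration rather than real Ensembl data. The function name `parse_fasta` is hypothetical.

```python
def parse_fasta(text):
    """Yield (header_fields, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header.split("|"), "".join(chunks)
            header, chunks = line[1:], []
        else:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        yield header.split("|"), "".join(chunks)


# Hypothetical export: IDs and sequences are placeholders, not real records.
sample = """>GENE_A|22
ATGGCC
TTTTAA
>GENE_B|22
ATGAAATGA
"""
records = list(parse_fasta(sample))
```

Keeping the header fields as a list leaves the caller free to map them onto whichever attributes were selected alongside the sequence.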

