Nucleic Acids Res. 2015 Jan;43(Database issue):D1107-12.
doi: 10.1093/nar/gku990. Epub 2014 Nov 6.

diArk--the database for eukaryotic genome and transcriptome assemblies in 2014

Martin Kollmar et al. Nucleic Acids Res. 2015 Jan.

Abstract

Eukaryotic genomes are the basis for understanding the complexity of life from populations to the molecular level. Recent technological innovations have revolutionized the speed of data generation, enabling the sequencing of eukaryotic genomes and transcriptomes within days. The database diArk (http://www.diark.org) has been developed with the aim of providing access to all available assembled genomes and transcriptomes. As of September 2014, diArk contains about 2600 eukaryotes with 6000 genome and transcriptome assemblies, of which 22% are not available via NCBI/ENA/DDBJ. Several indicators of assembly quality are provided to facilitate comparison and the selection of the most appropriate dataset for further studies. diArk has a user-friendly web interface with extensive options for filtering and browsing the sequenced eukaryotes. In this new version of the database we have also integrated species for which transcriptome assemblies are available, and we provide more analyses of assemblies.

Figures

Figure 1.
Representation of the nonredundant species (i.e. one strain per species) in diArk with their sequencing type and method. For comparison, all species for which transcriptome data and/or genome assemblies are available via NCBI/ENA/DDBJ are marked. Nine hundred and eighty-five of the assemblies have been published, but only 784 of them are linked to the genome assemblies at NCBI.
Figure 2.
Evolution of the fraction of nonredundant species compared to all sequenced species over time.
Figure 3.
Distribution of species for which EST/cDNA data, genome assemblies and transcriptome assemblies are available. For each sequencing type, the pie charts show the percentage of sequenced species for selected taxa.

