Bioinformatics. 2016 Aug 15;32(16):2508-10.
doi: 10.1093/bioinformatics/btw159. Epub 2016 Apr 7.

gEVAL - a web-based browser for evaluating genome assemblies

William Chow et al. Bioinformatics. 2016.

Abstract

Motivation: For most research approaches, genome analyses depend on the existence of a high-quality genome reference assembly. However, the local accuracy of an assembly remains difficult to assess and improve. The gEVAL browser allows the user to interrogate an assembly in any region of the genome by comparing it to different datasets and evaluating the concordance. These analyses include a wide variety of sequence alignments, comparative analyses of multiple genome assemblies, and consistency with optical and other physical maps. gEVAL highlights allelic variations, regions of low complexity, abnormal coverage, and potential sequence and assembly errors, and offers strategies for improvement. Although gEVAL focuses primarily on sequence integrity, it can also display arbitrary annotation, including annotation from Ensembl or TrackHub sources. We provide gEVAL web sites for many human, mouse, zebrafish and chicken assemblies to support the Genome Reference Consortium, and gEVAL is also downloadable to enable its use for any organism and assembly.

Availability and implementation: Web Browser: http://geval.sanger.ac.uk, Plugin: http://wchow.github.io/wtsi-geval-plugin

Contact: kj2@sanger.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Region on GRCh38 Chromosome 11 with variation and missing sequence. (A) Purple clone end pair mappings indicate that the same end is repeated, while red mappings indicate incorrect orientation of paired ends. (B) Two clone components are used to build this region of the assembly. The green box indicates a reliable overlap region (red would indicate high variation). (C) Orange indicates an incomplete transcript mapping. (D) Six single-molecule genome maps (orange/red) can be compared to the in silico digest (purple). Red regions indicate discordance. In this case, a ∼7.5 kb block variation is shared between three maps and the reference, whilst three other maps share two fragments. Furthermore, in the ∼39 kb digest block, all six maps indicate a size of ∼45–47 kb, giving evidence of missing sequence (∼7–8 kb). (E) Comparative analysis of the HuRef and YH2 assemblies reveals this missing sequence (dotted box) as well as the region of variation (Supplementary Figure S1).
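
The size comparison in panel (D) can be illustrated with a short sketch. This is not gEVAL code; it only reproduces the arithmetic behind the caption, under stated assumptions. The function names and the 2 kb threshold are hypothetical, and because the caption gives only a rounded range for the six maps, the two endpoints of that range stand in for the individual map sizes.

```python
from typing import List


def size_excess(reference_kb: float, observed_kb: List[float]) -> List[float]:
    """Return how much larger each observed map fragment is than the
    in silico digest fragment of the reference (all sizes in kb)."""
    return [obs - reference_kb for obs in observed_kb]


def consistent_excess(reference_kb: float,
                      observed_kb: List[float],
                      min_excess_kb: float) -> bool:
    """True when every observed fragment exceeds the reference fragment
    by at least min_excess_kb, i.e. all maps agree that the assembly
    is short of sequence in this block."""
    return all(e >= min_excess_kb
               for e in size_excess(reference_kb, observed_kb))


# Rounded values taken from the caption: a ~39 kb digest block in the
# reference, and maps reporting ~45-47 kb for the same block.
REFERENCE_FRAGMENT_KB = 39.0
OBSERVED_RANGE_KB = [45.0, 47.0]  # range endpoints, not per-map sizes

# Hypothetical threshold chosen for illustration only.
MIN_EXCESS_KB = 2.0

excess = size_excess(REFERENCE_FRAGMENT_KB, OBSERVED_RANGE_KB)
print("excess over reference (kb):", excess)
print("all maps indicate missing sequence:",
      consistent_excess(REFERENCE_FRAGMENT_KB, OBSERVED_RANGE_KB,
                        MIN_EXCESS_KB))
```

With these rounded inputs the excess comes out at 6 to 8 kb, which is in line with the caption's estimate of ∼7–8 kb of missing sequence given that all figures are approximate.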
