Review
2017 Sep 15;358(2):433-438.
doi: 10.1016/j.yexcr.2016.12.014. Epub 2016 Dec 23.

The value of new genome references

Kim C Worley et al. Exp Cell Res.

Abstract

Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has declined dramatically, and our ability to interpret those data has increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high-quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft-quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest-impact contributions that whole genome sequencing and assembly have made to comparative and evolutionary biology.

Keywords: Gene family analysis; Genome assembly; Segmental duplications; Sequencing.

Figures

Figure 1
This figure shows the challenges of creating complete genome representations. The relative sizes of different genomes, genome assembly contigs, and sequencing technologies are shown against the logarithmic scale across the bottom. At the top, in blue, are the sizes of genomes for different clades of organisms. Vertebrate genome sizes vary over 5 orders of magnitude, while mammalian genome sizes are more similar to one another (~3 Gb). In the middle, in red, are the size ranges for the contigs (measured by contig N50) for current sequencing technologies (Illumina next-generation sequencing or NGS assemblies, NGS assemblies with PacBio improvement, and PacBio de novo assemblies). The lengths of sequence reads, mate-pair distances, and mapping fragments of different technologies are shown in the green bars at the bottom.
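
The caption uses contig N50 as its measure of assembly contiguity without defining it. As a rough illustration only, the sketch below assumes the commonly used definition: the contig length at which contigs of that length or longer together cover at least half of the total assembled bases. The function name `contig_n50` and the example lengths are hypothetical and do not come from the review.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bases).

    Assumed definition: the largest length L such that contigs of
    length >= L together cover at least half of the total assembly.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly with nine contigs.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = contig_n50(toy_lengths)
```

In this toy case the three longest contigs (10, 9, and 8 bases) together reach half of the 54-base total, so the N50 is 8. A higher N50 for the same genome generally indicates a less fragmented assembly, which is why the figure compares technologies on this measure.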

