Review
Front Genet. 2015 Jun 19:6:220.
doi: 10.3389/fgene.2015.00220. eCollection 2015.

Using linkage maps to correct and scaffold de novo genome assemblies: methods, challenges, and computational tools

Janna L Fierst. Front Genet.

Abstract

Modern high-throughput DNA sequencing has made it possible to produce genome sequences inexpensively, but in practice many of these draft genomes are fragmented and incomplete. Genetic linkage maps based on recombination rates between physical markers have been used in biology for over 100 years. When paired with a de novo sequencing project, a linkage map can resolve mis-assemblies and anchor chromosome-scale sequences. Here, I summarize the methodology behind integrating de novo assemblies and genetic linkage maps, outline the current challenges, review the available software tools, and discuss new mapping technologies.

Keywords: draft genome; next-generation sequencing; optical mapping; physical mapping; scaffolds.

Figures

Figure 1
(A) In whole-genome assembly, errors result from mis-joins and from residual alleles, which appear as discrete sequences in the reference. Small fragments have no genomic context and contribute little information. (B) Using a genetic linkage map to anchor a de novo assembly resolves errors in the reference sequence by giving small sequences genomic context, resolving allelism, and identifying mis-joins. Chromosome-scale assemblies can be constructed by ordering and orienting sequences with the linkage map. (C) A genetic linkage map can be estimated from a parental cross resulting in an F2, F3, or backcross (here, BC1) population. Estimating a genetic linkage map requires (D) genotyping individuals at discrete markers (here, six markers across eight individuals, with missing data) and (E) grouping markers into linkage groups, then ordering and spacing markers within each group. Estimating order and spacing is difficult because of missing data and because little recombination occurs between adjacent markers.
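
Steps (D) and (E) of the caption can be illustrated with a small Python sketch. It is not the method of any tool reviewed in the article. It assumes a backcross in which each genotype is coded "A" or "H", with "-" for a missing call, and it assumes all markers are in coupling phase. The marker names, the genotype strings, and the 0.25 grouping threshold are all hypothetical values chosen for illustration. The sketch estimates pairwise recombination fractions while skipping missing calls, converts a fraction to a Haldane map distance, and groups markers by single linkage.

```python
import math

MISSING = "-"  # assumed code for a missing genotype call


def recombination_fraction(m1, m2):
    """Share of individuals, genotyped at both markers, whose calls differ."""
    pairs = [(a, b) for a, b in zip(m1, m2) if MISSING not in (a, b)]
    if not pairs:
        return None  # no individual is informative for this marker pair
    recombinants = sum(1 for a, b in pairs if a != b)
    return recombinants / len(pairs)


def haldane_cm(r):
    """Haldane map distance in centimorgans for recombination fraction r."""
    if r >= 0.5:
        return math.inf  # unlinked: the mapping function is undefined here
    return -50.0 * math.log(1.0 - 2.0 * r)


def linkage_groups(markers, max_r=0.25):
    """Single-linkage grouping: join two markers when their r is <= max_r."""
    names = list(markers)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = recombination_fraction(markers[a], markers[b])
            if r is not None and r <= max_r:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical data: six markers scored in eight backcross individuals.
markers = {
    "m1": "AAHHAHHA",
    "m2": "AAHHAH-A",
    "m3": "AAHHHHHA",
    "m4": "HAHAHAHA",
    "m5": "HAHA-AHA",
    "m6": "HAHAHAAA",
}

if __name__ == "__main__":
    for group in linkage_groups(markers):
        print(group)
    r = recombination_fraction(markers["m1"], markers["m3"])
    print(r, haldane_cm(r))
```

With only eight individuals, a single missing call changes the denominator of the estimate, and adjacent markers often show zero recombinants. These are the two difficulties the caption names for ordering and spacing markers.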
