Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences
- PMID: 27153593
- PMCID: PMC4937194
- DOI: 10.1093/bioinformatics/btw152
Abstract
Motivation: Single Molecule Real-Time (SMRT) sequencing technology and Oxford Nanopore Technologies (ONT) produce reads over 10 kb in length, which have enabled high-quality genome assembly at an affordable cost. However, at present, long reads have an error rate as high as 10–15%. Complex and computationally intensive pipelines are required to assemble such reads.
Results: We present a new mapper, minimap, and a de novo assembler, miniasm, for efficiently mapping and assembling SMRT and ONT reads without an error correction stage. They can often assemble a sequencing run of bacterial data into a single contig in a few minutes, and assemble 45-fold Caenorhabditis elegans data in 9 min, orders of magnitude faster than existing pipelines, though the consensus sequence error rate is as high as that of the raw reads. We also introduce a pairwise read mapping format and a graphical fragment assembly format, and demonstrate the interoperability between our tools and existing ones.
Availability and implementation: https://github.com/lh3/minimap and https://github.com/lh3/miniasm
Contact: hengli@broadinstitute.org
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.