FLASH: fast length adjustment of short reads to improve genome assemblies
- PMID: 21903629
- PMCID: PMC3198573
- DOI: 10.1093/bioinformatics/btr507
Abstract
Motivation: Next-generation sequencing technologies generate very large numbers of short reads. Even with very deep genome coverage, short read lengths cause problems in de novo assemblies. The use of paired-end libraries with a fragment size shorter than twice the read length provides an opportunity to generate much longer reads by overlapping and merging read pairs before assembling a genome.
Results: We present FLASH, a fast computational tool to extend the length of short reads by overlapping paired-end reads from fragment libraries that are sufficiently short. We tested the correctness of the tool on one million simulated read pairs, and we then applied it as a pre-processor for genome assemblies of Illumina reads from the bacterium Staphylococcus aureus and human chromosome 14. FLASH correctly extended and merged reads >99% of the time on simulated reads with an error rate of <1%. With adequately set parameters, FLASH correctly merged reads over 90% of the time even when the reads contained up to 5% errors. When FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds.
Availability and implementation: The FLASH system is implemented in C and is freely available as open-source code at http://www.cbcb.umd.edu/software/flash.
Contact: t.magoc@gmail.com.
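The merging idea described in the abstract can be illustrated with a minimal sketch. When the fragment is shorter than twice the read length, the two reads of a pair overlap by roughly `2 * read_length - fragment_length` bases, so the pair can be joined into one longer read. The code below is not FLASH's implementation. The function names, the `min_overlap` and `max_mismatch_ratio` parameters, their default values, and the rule of keeping the first read's bases at mismatched positions are all assumptions made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.25):
    """Merge a read pair by the overlap with the lowest mismatch ratio.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented before being compared with the 3' end of read1.
    Returns the merged sequence, or None if no acceptable overlap exists.
    Both thresholds are hypothetical illustration values.
    """
    mate = reverse_complement(read2)
    best = None  # (mismatch_ratio, merged_sequence)
    longest = min(len(read1), len(mate))
    # Try overlaps from longest to shortest; ties keep the longer overlap.
    for olap in range(longest, min_overlap - 1, -1):
        tail = read1[-olap:]   # 3' end of the first read
        head = mate[:olap]     # 5' end of the reoriented mate
        mismatches = sum(a != b for a, b in zip(tail, head))
        ratio = mismatches / olap
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            # Simplification: the overlap keeps read1's bases as-is.
            best = (ratio, read1 + mate[olap:])
    return None if best is None else best[1]


# Usage: a 30-base fragment sequenced with 20-base reads overlaps by 10 bases.
fragment = "ACGTTGCAAGCTTAGGCTCCATGATCGTAC"
read1 = fragment[:20]
read2 = reverse_complement(fragment[-20:])
merged = merge_pair(read1, read2)
```

A production tool would also need to weigh base quality scores when the two reads disagree inside the overlap, which this sketch ignores.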
Similar articles
- COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly. Bioinformatics. 2012 Nov 15;28(22):2870-4. doi: 10.1093/bioinformatics/bts563. Epub 2012 Oct 8. PMID: 23044551
- PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics. 2014 Mar 1;30(5):614-20. doi: 10.1093/bioinformatics/btt593. Epub 2013 Oct 18. PMID: 24142950 Free PMC article.
- QuorUM: An Error Corrector for Illumina Reads. PLoS One. 2015 Jun 17;10(6):e0130821. doi: 10.1371/journal.pone.0130821. eCollection 2015. PMID: 26083032 Free PMC article.
- Chromosome-level hybrid de novo genome assemblies as an attainable option for nonmodel insects. Mol Ecol Resour. 2020 Sep;20(5):1277-1293. doi: 10.1111/1755-0998.13176. Epub 2020 Jun 7. PMID: 32329220 Review.
- De novo assembly of short sequence reads. Brief Bioinform. 2010 Sep;11(5):457-72. doi: 10.1093/bib/bbq020. Epub 2010 Aug 19. PMID: 20724458 Review.
Cited by
- Different sources of alfalfa hay alter the composition of rumen microbiota in mid-lactation Holstein cows without affecting production performance. Front Vet Sci. 2024 Oct 21;11:1433876. doi: 10.3389/fvets.2024.1433876. eCollection 2024. PMID: 39497747 Free PMC article.
- New insights into the spatial variability of microbial diversity and density in peatlands exposed to various electron acceptors with an emphasis on methanogenesis and CO2 fluxes. Front Microbiol. 2024 Oct 15;15:1468344. doi: 10.3389/fmicb.2024.1468344. eCollection 2024. PMID: 39473851 Free PMC article.
- Altered Gut Microbiota Patterns in Young Children with Recent Maltreatment Exposure. Biomolecules. 2024 Oct 16;14(10):1313. doi: 10.3390/biom14101313. PMID: 39456245 Free PMC article.
- Natural Cross-Kingdom Spread of Apple Scar Skin Viroid from Apple Trees to Fungi. Cells. 2022 Nov 20;11(22):3686. doi: 10.3390/cells11223686. PMID: 36429116 Free PMC article.
- Microbiome Analysis of Traditional Grain Vinegar Produced under Different Fermentation Conditions in Various Regions in Korea. Foods. 2022 Nov 10;11(22):3573. doi: 10.3390/foods11223573. PMID: 36429165 Free PMC article.