ContigExtender: a new approach to improving de novo sequence assembly for viral metagenomics data
- PMID: 33706720
- PMCID: PMC7953547
- DOI: 10.1186/s12859-021-04038-2
Abstract
Background: Metagenomics is the study of microbial genomes for pathogen detection and discovery in human clinical, animal, and environmental samples via Next-Generation Sequencing (NGS). Metagenome de novo sequence assembly is a crucial analytical step in which longer contigs, ideally whole chromosomes/genomes, are formed from shorter NGS reads. However, the contigs generated by de novo assembly are often highly fragmented and rarely longer than a few kilobases (kb). Therefore, a time-consuming extension process is routinely performed on the de novo assembled contigs.
Results: To facilitate this process, we propose a new tool for metagenome contig extension after de novo assembly. ContigExtender employs a novel recursive extending strategy that explores multiple extending paths to achieve highly accurate longer contigs. We demonstrate that ContigExtender outperforms existing tools in synthetic, animal, and human metagenomics datasets.
Conclusions: A novel software tool, ContigExtender, has been developed to assist and enhance the performance of metagenome de novo assembly. ContigExtender effectively extends contigs from a variety of sources and can be incorporated into most viral metagenomics analysis pipelines for a wide variety of applications, including pathogen detection and viral discovery.
Keywords: De novo assembly; Metagenomics; Next-Gen Sequencing; Pathogen detection; Viral discovery.
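The abstract describes a recursive strategy that explores multiple extending paths but gives no algorithmic detail. The following is a minimal toy sketch of that general idea, not ContigExtender's actual implementation: it extends only the right end of a seed contig, uses exact suffix/prefix overlaps, ignores reverse complements and sequencing errors, and simply keeps the longest path found. The function names `right_extensions` and `extend_contig`, and the parameters `min_overlap` and `max_depth`, are hypothetical and chosen only for illustration.

```python
from typing import List, Set


def right_extensions(contig: str, reads: List[str], min_overlap: int) -> Set[str]:
    """Return the distinct overhangs that reads add past the contig's right end."""
    overhangs: Set[str] = set()
    for read in reads:
        # Leave at least one base of overhang, and try the longest overlap first.
        max_len = min(len(contig), len(read) - 1)
        for olen in range(max_len, min_overlap - 1, -1):
            if contig.endswith(read[:olen]):
                overhangs.add(read[olen:])
                break
    return overhangs


def extend_contig(contig: str, reads: List[str],
                  min_overlap: int = 4, max_depth: int = 50) -> str:
    """Recursively follow every candidate extension and keep the longest result.

    max_depth is an assumed safeguard that bounds recursion on repeats.
    """
    if max_depth == 0:
        return contig
    best = contig
    for overhang in right_extensions(contig, reads, min_overlap):
        candidate = extend_contig(contig + overhang, reads,
                                  min_overlap, max_depth - 1)
        if len(candidate) > len(best):
            best = candidate
    return best


# Toy data: two reads continue the seed, and one of them leads to a dead end.
reads = ["CGTAGGCTT", "GCTTAACGGAT", "CGTAGGA"]
extended = extend_contig("ATGCCGTA", reads)
print(extended)
```

In this toy run the seed has two candidate extensions; the branch through `CGTAGGA` stops immediately, while the branch through `CGTAGGCTT` continues via `GCTTAACGGAT`, so the longer path is returned. A practical tool would also need to score competing paths for accuracy rather than rely on length alone.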
Conflict of interest statement
The authors declare that they have no competing interests.